Preface


Partial  differential  equations  (PDE)  first  appeared  over  300  years  ago,  and  the  vast 
scope  of  the  theory  and  applications  that  have  since  developed  makes  it  challenging 
to give a reasonable introduction in a single semester. The modern mathematical
approach  to  the  subject  requires  considerable  background  in  analysis,  including 
topics  such  as  metric  space  topology,  measure  theory,  and  functional  analysis. 

This book is intended for an introductory course for students who do not
necessarily have this analysis background. Courses taught at this level traditionally
focus  on  some  of  the  more  elementary  topics,  such  as  Fourier  series  and  simple 
boundary  value  problems.  This  approach  risks  giving  students  a  somewhat  narrow 
and  outdated  view  of  the  subject. 

My  goal  here  is  to  give  a  balanced  presentation  that  includes  modern  methods, 
without  requiring  prerequisites  beyond  vector  calculus  and  linear  algebra.  To  allow 
for  some  of  the  more  advanced  methods  to  be  reached  within  a  single  semester,  the 
treatment  is  necessarily  streamlined  in  certain  ways.  Concepts  and  definitions  from 
analysis  are  introduced  only  as  they  will  be  needed  in  the  text,  and  the  reader  is 
asked  to  accept  certain  fundamental  results  without  justification.  The  emphasis  is 
not  on  the  rigorous  development  of  analysis  in  its  own  right,  but  rather  on  the  role 
that  tools  from  analysis  play  in  PDE  applications. 

The  text  generally  focuses  on  the  most  important  classical  PDE,  which  are  the 
wave,  heat,  and  Laplace  equations.  Nonlinear  equations  are  discussed  to  some 
extent,  but  this  coverage  is  limited.  (Even  at  a  very  introductory  level,  the  nonlinear 
theory  merits  a  full  course  to  itself.) 

I  have  tried  to  stress  the  interplay  between  modeling  and  mathematical  analysis 
wherever  possible.  These  connections  are  vital  to  the  subject,  both  as  a  source  of 
problems  and  as  an  inspiration  for  the  development  of  methods. 
Chapter 1

Introduction 


1.1  Partial  Differential  Equations 

Continuous phenomena, such as wave propagation or fluid flow, are generally
modeled with partial differential equations (PDE), which express relationships between
rates of change with respect to multiple independent variables. In contrast,
phenomena that can be described with a single independent variable, such as the motion
of a rigid body in classical physics, are modeled by ordinary differential equations
(ODE).

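A general PDE for a function u has the form

$$
F\left(x, u, \frac{\partial u}{\partial x_1}, \dots, \frac{\partial u}{\partial x_n}, \dots, \frac{\partial^m u}{\partial x_n^m}\right) = 0. \tag{1.1}
$$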


The order of this equation is m, the order of the highest derivative appearing (which
is assumed to be finite). A classical solution u admits continuous partial derivatives
up to order m and satisfies (1.1) at all points x in its domain. In certain situations the
differentiability requirements can be relaxed, allowing us to define weak solutions
that do not solve the equation literally.

A  somewhat  subtle  aspect  of  the  definition  (1.1)  is  the  fact  that  the  equation 
is  required  to  be  local.  This  means  that  functions  and  derivatives  appearing  in  the 
equation  are  all  evaluated  at  the  same  point. 

Although  classical  physics  provided  the  original  impetus  for  the  development 
of  PDE  theory,  PDE  models  have  since  played  a  crucial  role  in  many  other  fields, 
including  engineering,  chemistry,  biology,  ecology,  medicine,  and  finance.  Many 
industrial  applications  of  mathematics  are  based  on  the  numerical  analysis  of  PDE. 

Most  PDE  are  not  solvable  in  the  explicit  sense  that  a  simple  calculus  problem  can 
be solved. That is, we typically cannot obtain an exact formula for u(x). Therefore
much  of  the  analysis  of  PDE  is  focused  on  drawing  meaningful  conclusions  from  an 
equation  without  actually  writing  down  a  solution. 

© Springer International Publishing AG 2016

D. Borthwick, Introduction to Partial Differential Equations,

Universitext, DOI 10.1007/978-3-319-48936-0_1




1.2  Example:  d’Alembert’s  Wave  Equation 


One of the earliest and most influential PDE models was the wave equation,
developed by Jean d’Alembert in 1746 to describe the motion of a vibrating string. With
physical constants normalized to 1, the equation reads

$$
\frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} = 0, \tag{1.2}
$$


where u(t, x) denotes the vertical displacement of the string at position x and time
t. If the string has length $\ell$ and is attached at both ends, then we also require that
$u(t, 0) = u(t, \ell) = 0$ for all t. We will discuss the formulation of this model in
Sect. 4.1.

D’Alembert also found a general formula for the solution of (1.2), based on the
observation that (1.2) is solved by any function of the form $f(x \pm t)$, assuming $f$ is
twice-differentiable. Given two such functions on $\mathbb{R}$, we can write a general solution

$$
u(t, x) := f_1(x + t) + f_2(x - t). \tag{1.3}
$$

A similar formula applies in the case of a string with fixed ends. If $f$ is $2\ell$-periodic
on $\mathbb{R}$, meaning $f(x + 2\ell) = f(x)$ for all x, then it is easy to check that

$$
u(t, x) := \frac{1}{2}\left[f(x + t) - f(t - x)\right] \tag{1.4}
$$

satisfies $u(t, 0) = u(t, \ell) = 0$ for any t.
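As a quick numerical sanity check of (1.4), the following Python sketch evaluates u for one illustrative profile and estimates the residual of (1.2) by central differences. The string length `ELL`, the particular profile `f`, and the step size `h` are assumptions chosen for illustration only; they do not come from the text.

```python
import math

ELL = 1.0  # string length (illustrative value, not from the text)


def f(s):
    """A smooth, 2*ELL-periodic profile (illustrative choice)."""
    return math.sin(math.pi * s / ELL) + 0.3 * math.cos(2.0 * math.pi * s / ELL)


def u(t, x):
    """Formula (1.4): u(t, x) = (1/2) [f(x + t) - f(t - x)]."""
    return 0.5 * (f(x + t) - f(t - x))


def wave_residual(func, t, x, h=1e-3):
    """Central-difference estimate of u_tt - u_xx at the point (t, x)."""
    u_tt = (func(t + h, x) - 2.0 * func(t, x) + func(t - h, x)) / h**2
    u_xx = (func(t, x + h) - 2.0 * func(t, x) + func(t, x - h)) / h**2
    return u_tt - u_xx


if __name__ == "__main__":
    print("u at left end: ", u(0.37, 0.0))
    print("u at right end:", u(0.37, ELL))
    print("residual of (1.2):", wave_residual(u, 0.37, 0.42))
```

Both boundary values and the residual should come out at the level of rounding error, which is consistent with (but of course does not prove) the claim in the text.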

One curious feature of this formula is that it appears to give a sensible solution
even in cases where $f$ is not differentiable. For example, to model a plucked string
we might take the initial displacement to be a simple piecewise linear function in the
form of a triangle from the fixed endpoints, as shown in Fig. 1.1.

If we extend this to an odd, $2\ell$-periodic function on $\mathbb{R}$, then the formula (1.4) yields
the result illustrated in Fig. 1.2. The initial kink splits into two kinks which travel in
opposite directions on the string and appear to rebound from the fixed ends.

This  is  not  a  classical  solution  because  u  is  not  differentiable  at  the  kinks.  However, 
u  does  satisfy  the  requirements  for  a  weak  solution,  as  we  will  see  in  Chap.  10. 
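The plucked-string construction can be sketched in a few lines of Python. This is a minimal illustration, assuming a string of length 1 plucked at its midpoint to height 1; the helper names `triangle`, `odd_periodic_extension`, and `plucked_string` are hypothetical and introduced only for this example.

```python
def triangle(x, ell=1.0, peak=0.5, height=1.0):
    """Piecewise linear initial displacement on [0, ell], zero at both ends."""
    if x <= peak:
        return height * x / peak
    return height * (ell - x) / (ell - peak)


def odd_periodic_extension(g, ell=1.0):
    """Extend g from [0, ell] to an odd, 2*ell-periodic function on the real line."""
    def f(s):
        s = (s + ell) % (2.0 * ell) - ell  # reduce s to the interval [-ell, ell)
        return g(s) if s >= 0.0 else -g(-s)
    return f


def plucked_string(t, x, ell=1.0):
    """Evaluate formula (1.4) with f the odd periodic extension of the triangle."""
    f = odd_periodic_extension(lambda y: triangle(y, ell, peak=0.5 * ell), ell)
    return 0.5 * (f(x + t) - f(t - x))


if __name__ == "__main__":
    # Sample the string at a few times to watch the kink split and travel.
    for t in (0.0, 0.25, 0.5):
        row = [round(plucked_string(t, 0.1 * j), 3) for j in range(11)]
        print(f"t = {t}: {row}")
```

Because f is odd, u(0, x) reproduces the triangle and the initial velocity is zero. At t = 0.25 the midpoint has dropped to half its initial height, which reflects the single peak flattening into a plateau bounded by two kinks.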

Although  a  physical  string  could  not  exhibit  sharp  corners  without  breaking,  the 
piecewise  linear  solutions  are  nevertheless  physically  reasonable.  Direct  observations 
of  plucked  and  bowed  strings  were  first  made  in  the  late  19th  century  by  Hermann 
von  Helmholtz,  who  saw  patterns  of  oscillation  quite  similar  to  what  is  shown  in 
Fig.  1.2.  The  appearance  of  kinks  propagating  along  the  string  is  striking,  although 
the  corners  are  not  exactly  sharp. 


Fig.  1.1  Initial  state  of  a 
plucked  string 
Fig. 1.2 Evolution of the
plucked  string,  starting  from 
t  =  0  at  the  top 


1.3  Types  of  Equations 

There  is  no  general  theory  of  PDE  that  allows  us  to  analyze  all  equations  of  the  form 
(1.1).  To  make  progress  it  is  necessary  to  restrict  our  attention  to  certain  classes  of 
equations  and  develop  methods  appropriate  to  those. 

The  most  fundamental  distinction  between  PDE  is  the  property  of  linearity.  A 
PDE  is  called  linear  if  it  can  be  written  in  the  form 


$$
Lu = f, \tag{1.5}
$$

where $f$ is some function independent of $u$, and $L$ is a differential operator. Many of
the important classical PDE that we will discuss in this book are linear and of first
or second order. For such cases $L$ has the general form

$$
L = -\sum_{i,j=1}^{n} a_{ij} \frac{\partial^2}{\partial x_i \partial x_j} + \sum_{j=1}^{n} b_j \frac{\partial}{\partial x_j} + c, \tag{1.6}
$$

where the coefficients $a_{ij}$, $b_j$, and $c$ are functions of x. The second-order coefficients
are assumed to be symmetric, $a_{ij} = a_{ji}$, because the mixed partial derivatives of a
twice continuously differentiable function commute.

Linearity  implies  that  a  linear  combination  of  solutions  is  still  a  solution,  a  fact  that 
is  referred  to  as  the  superposition  principle.  Superposition  often  lets  us  decompose 
problems  into  simpler  components,  which  is  the  main  reason  that  linear  problems 
are  much  easier  to  handle  than  nonlinear.  It  also  makes  it  possible  to  work  with 
complex-valued  solutions,  which  is  sometimes  more  convenient,  because  the  real 
and  imaginary  parts  of  a  complex  solution  will  solve  the  equation  independently. 

Most linear PDE are derived as approximations to more realistic, nonlinear
models. We will focus primarily on the linear case in this book. The main reason for this
is that nonlinear PDE are inherently more complicated, and for an introduction it
makes sense to start with the more basic theory. Furthermore, the analysis of
nonlinear problems frequently involves the study of associated linear approximations, so
that one must understand at least some of the linear theory first.

Linear  equations  are  further  classified  by  the  properties  of  the  terms  with  the 
highest  orders  of  derivatives,  since  this  determines  many  qualitative  properties  of 
solutions. Elliptic equations of second order are associated to an operator $L$ of the
form (1.6), such that the eigenvalues of the symmetric matrix $[a_{ij}]$ are strictly positive
at each point in the domain. The prototype of an elliptic operator is $L = -\Delta$ where
$\Delta$ denotes the Laplacian,

$$
\Delta := \frac{\partial^2}{\partial x_1^2} + \dots + \frac{\partial^2}{\partial x_n^2}, \tag{1.7}
$$

named after the mathematician and physicist Pierre-Simon Laplace.
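The ellipticity condition can be tested at a single point by checking whether the coefficient matrix there is positive definite. The Python sketch below does this with an attempted Cholesky factorization, which succeeds exactly when all eigenvalues of a symmetric matrix are strictly positive. The function name and the sample matrices are illustrative assumptions; for variable coefficients the check would have to be repeated at each point of the domain.

```python
import math


def is_positive_definite(a):
    """Return True if the symmetric matrix `a` (a list of rows) has only
    strictly positive eigenvalues, tested by attempting a Cholesky factorization."""
    n = len(a)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False  # a non-positive pivot rules out positive definiteness
                chol[i][i] = math.sqrt(s)
            else:
                chol[i][j] = s / chol[j][j]
    return True


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]      # coefficients of L = -Laplacian in two variables
    indefinite = [[1.0, 0.0], [0.0, -1.0]]   # eigenvalues of mixed sign
    print(is_positive_definite(identity))    # True
    print(is_positive_definite(indefinite))  # False
```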

Equations that include time as an independent variable are called evolution
equations. The time variable usually plays a very different role from the spatial variables,
so in such cases we adapt the form (1.6) by separating out the time derivatives
explicitly.

The  two  classic  types  of  second-order  evolution  equations  are  hyperbolic  and 
parabolic.  Hyperbolic  equations  are  exemplified  by  d’Alembert’s  wave  equation 
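(1.2). The general form is (1.5) with

$$
L = \frac{\partial^2}{\partial t^2} - \sum_{i,j=1}^{n} a_{ij} \frac{\partial^2}{\partial x_i \partial x_j} + (\text{lower order terms}), \tag{1.8}
$$

where once again $[a_{ij}]$ is assumed to be a strictly positive matrix. Hyperbolic
equations are used to model oscillatory phenomena.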

Parabolic evolution equations have the form (1.5) with

$$
L = \frac{\partial}{\partial t} - \sum_{i,j=1}^{n} a_{ij} \frac{\partial^2}{\partial x_i \partial x_j} + (\text{lower order terms}), \tag{1.9}
$$

where $[a_{ij}]$ is a strictly positive matrix. The heat equation, whose derivation we will
discuss  in  detail  in  Sect.  6.1,  is  the  prototype  for  this  type  of  equation.  Parabolic 
equations  are  generally  used  to  model  phenomena  of  conduction  and  diffusion. 

Note  that  hyperbolic  and  parabolic  equations  revert  to  elliptic  equations  in  the 
spatial  variables  if  the  solution  is  independent  of  time.  Elliptic  equations  thus  serve 
to  model  the  equilibrium  states  of  evolution  equations. 

Because  of  their  association  with  phenomenological  properties  of  a  system,  the 
terms  “elliptic”,  “hyperbolic”,  and  “parabolic”  are  frequently  applied  more  broadly 
than  this  simple  classification  would  suggest.  A  nonlinear  equation  is  typically 
described  by  the  category  of  its  linear  approximations,  which  can  change  depending 
on  the  conditions. 

For problems on a bounded domain, the application usually dictates some
restriction on the solutions at the boundary. Two very common types are Dirichlet
boundary conditions, specifying the values of u at the boundary, and Neumann conditions,
specifying the normal derivatives of u at the boundary. These conditions are named
for Gustave Lejeune Dirichlet and Carl Neumann, respectively. By default we will
use these terms in the homogeneous sense, meaning that the boundary values of the
function or derivative are set equal to zero. For evolution equations, we also impose
initial conditions, specifying the values of u and possibly its time derivatives at some
initial time.


1.4  Well  Posed  Problems 

The  set  of  functions  used  to  formulate  a  PDE,  which  might  include  coefficients  or 
terms  in  the  equation  itself  as  well  as  boundary  and  initial  conditions,  is  collectively 
referred  to  as  the  input  data.  The  most  basic  question  for  any  PDE  is  whether  a 
solution  exists  for  a  given  set  of  data.  However,  for  most  purposes  we  want  to  require 
something  more.  A  PDE  problem  is  said  to  be  well  posed  if,  for  a  given  set  of  data: 

1.  A  solution  exists. 

2.  The  solution  is  uniquely  determined  by  the  data. 

3.  The  solution  depends  continuously  on  the  data. 

These criteria were formulated by Jacques Hadamard in 1902. The first two properties
hold  for  ODE  under  rather  general  assumptions,  but  not  necessarily  for  PDE.  It  is 
easy  to  find  nonlinear  equations  that  admit  no  solutions,  and  even  in  the  linear  case 
there  is  no  guarantee. 

The  third  condition,  continuous  dependence  on  the  input  data,  is  sometimes  called 
stability. One practical justification for this requirement is that it is not possible to specify
input  data  with  absolute  accuracy.  Stability  implies  that  the  effects  of  small  variations 
in  the  data  can  be  controlled. 

For certain PDE, especially the classical linear cases, we have a good
understanding of the requirements for well-posedness. For other important problems, for
example in fluid mechanics, well-posedness remains a difficult unsolved conjecture.
Furthermore, many interesting problems are known not to be well posed. For
example, problems in image processing are frequently ill posed, because information is
lost due to noise or technological limitations.

Fig. 1.3 Numerical simulations of blood flow in the aorta. Courtesy of D. Gupta, Emory University
Hospital, and T. Passerini, M. Piccinelli and A. Veneziani, Emory Mathematics and Computer
Science


1.5  Approaches 

We  can  organize  the  methods  for  handling  PDE  problems  according  to  three  basic 
goals: 

1.  Solving :  finding  explicit  formulas  for  solutions. 

2.  Analysis :  understanding  general  properties  of  solutions. 

3.  Approximation :  calculating  solutions  numerically. 

Solving  PDE  is  certainly  worth  understanding  in  those  special  cases  where  it  is 
possible.  The  solution  formulas  available  for  certain  classical  PDE  provide  insight 
that  is  important  to  the  development  of  the  theory. 

The  goals  of  theoretical  analysis  of  PDE  are  extremely  broad.  We  wish  to  learn 
as  much  as  we  can  about  the  qualitative  and  quantitative  properties  of  solutions  and 
their  relationship  to  the  input  data. 

Finally,  numerical  computation  is  the  primary  means  by  which  applications  of 
PDE  are  carried  out.  Computational  methods  rely  on  a  foundation  of  theoretical 
analysis,  but  also  bring  up  new  considerations  such  as  efficiency  of  calculation. 

Example  1.1  Figure  1.3  shows  a  set  of  numerical  simulations  modeling  the  insertion 
in  the  aorta  of  a  pipe-like  device  designed  to  improve  blood  flow.  The  leftmost  frame 
shows the aorta before surgery, and the three panels on the right model the insertion
at different locations. The PDE model is a complex set of fluid equations called the
Navier-Stokes  equations.  These  fluid  equations  are  famously  difficult  to  analyze  and 
an  exact  solution  is  almost  never  possible.  However,  the  cylinder  is  one  case  that  can 
be  handled  explicitly.  For  the  numerical  simulations,  exact  solutions  for  a  cylindrical 
pipe  were  used  to  provide  boundary  data  at  the  point  where  the  pipe  meets  the  aorta. 

Theoretical  analysis  also  plays  an  important  role  here,  in  that  the  regularity  theory 
for  the  fluid  equations  is  used  to  predict  the  accuracy  of  the  simulation.  (The  complete 
well-posedness  analysis  of  the  Navier-Stokes  equations  remains  a  famously  unsolved 
problem,  however.) 

The simulated flows displayed in Fig. 1.3 were computed numerically by a
technique called the finite element method. This involves discretizing the problem to
reduce the PDE to a system of linear algebraic equations. Modeling a single
heartbeat in this simulation requires solving a linear system of about 500 million equations.
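To give a sense of how discretization turns a PDE into linear algebra, here is a toy one-dimensional analogue in Python. It is not the Navier-Stokes computation described above. It assumes the model problem -u'' = f on (0, 1) with u(0) = u(1) = 0, piecewise linear elements on a uniform mesh, and a load vector approximated by h f(x_i). The resulting tridiagonal system is solved by forward elimination and back substitution. All names are illustrative.

```python
def solve_poisson_1d(f, n):
    """Approximate -u'' = f on (0, 1) with u(0) = u(1) = 0, using n interior nodes.

    Returns the approximate values of u at the interior nodes x_i = (i + 1) * h.
    """
    h = 1.0 / (n + 1)
    # Stiffness matrix for piecewise linear elements: tridiagonal (-1, 2, -1) / h.
    lower = [-1.0 / h] * n
    diag = [2.0 / h] * n
    upper = [-1.0 / h] * n
    # Load vector: integral of f against each hat function, approximated by h * f(x_i).
    rhs = [h * f((i + 1) * h) for i in range(n)]
    # Forward elimination on the tridiagonal system.
    for i in range(1, n):
        m = lower[i] / diag[i - 1]
        diag[i] -= m * upper[i - 1]
        rhs[i] -= m * rhs[i - 1]
    # Back substitution.
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] - upper[i] * u[i + 1]) / diag[i]
    return u


if __name__ == "__main__":
    # For f = 1 the exact solution is u(x) = x (1 - x) / 2.
    values = solve_poisson_1d(lambda x: 1.0, 9)
    print(values[4])  # value at x = 0.5; the exact solution gives 0.125
```

Here the system has 9 unknowns. The simulation in Example 1.1 follows the same pattern in spirit, but in three dimensions and with a far larger system.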


Chapter  2 

Preliminaries 


In  this  chapter  we  set  the  stage  for  the  study  of  PDE  with  a  review  of  some  core 
background  material. 


2.1  Real  Numbers 

The real number system $\mathbb{R}$ is constructed as the “completion” of the field of rational
numbers. This means that in addition to the algebraic axioms for addition and
multiplication, $\mathbb{R}$ satisfies an additional axiom related to the existence of limits. To state
this axiom we use the concept of the supremum (or “least upper bound”) of a subset
$A \subset \mathbb{R}$. The supremum is a number $\sup(A) \in \mathbb{R}$ such that (1) all elements of $A$ are
less than or equal to $\sup(A)$; and (2) no number strictly less than $\sup(A)$ has this
property. The completeness axiom says that every nonempty subset of $\mathbb{R}$ that is bounded
above has a supremum. An equivalent statement is that a nonempty subset that is
bounded below has an infimum (“greatest lower bound”), which is denoted $\inf(A)$.

It is convenient to extend these definitions to unbounded sets by defining $\sup(A) := \infty$
when $A$ is not bounded above, and $\inf(A) := -\infty$ when the set is not bounded
below. We also set $\sup(\emptyset) := -\infty$ and $\inf(\emptyset) := +\infty$. With these extensions, sup
and inf are defined for all subsets of $\mathbb{R}$.
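For finite sets the supremum and infimum coincide with the maximum and minimum, so the empty-set conventions are easy to mirror in code. The helper names below are hypothetical, and the sketch deliberately covers only finite collections; infinite sets such as open intervals are outside its scope.

```python
import math


def sup_finite(values):
    """Supremum of a finite collection of reals, with sup of the empty set equal to -infinity."""
    return max(values, default=-math.inf)


def inf_finite(values):
    """Infimum of a finite collection of reals, with inf of the empty set equal to +infinity."""
    return min(values, default=math.inf)


if __name__ == "__main__":
    print(sup_finite([1.0, 3.0, 2.0]), inf_finite([1.0, 3.0, 2.0]))
    print(sup_finite([]), inf_finite([]))
```

One reason these conventions are natural: with them, enlarging a set can only increase sup and decrease inf, even when the starting set is empty.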

To  illustrate  the  definition,  we  present  a  simple  result  that  will  prove  useful  in  the 
construction  of  approximating  sequences  for  solutions  of  PDE. 

Lemma 2.1 For a nonempty set $A \subset \mathbb{R}$ there exists a sequence of points $x_k \in A$
such that

$$
\lim_{k \to \infty} x_k = \sup A,
$$

and similarly for $\inf A$.


The  original  version  of  the  book  was  revised:  Belated  corrections  from  author  have  been  incorpo¬ 
rated.  The  erratum  to  the  book  is  available  at  https://doi.org/10.1007/978-3-319-48936-0_14 


©  Springer  International  Publishing  AG  2016 
D.  Borthwick,  Introduction  to  Partial  Differential  Equations , 
Universitext,  DOI  10.1007/978-3-319-48936-0_2 
Proof If $A$ is not bounded above then there exists a sequence of $x_k \in A$ with $x_k \to \infty$.
Therefore the claim holds when $\sup A = \infty$.

Now suppose that $\sup A = a \in \mathbb{R}$. By the definition of the supremum, $a - 1/k$
is not an upper bound of $A$ for $k \in \mathbb{N}$. Therefore, for each $k$ there exists $x_k \in A$ such
that $a - 1/k < x_k \le a$. This yields a sequence such that $x_k \to a$. □

There  is  an  important  distinction  between  supremum  and  infimum  and  the  related 
concepts  of  maximum  and  minimum.  The  latter  are  required  to  be  elements  of  the  set 
and  thus  may  not  exist.  For  example,  the  interval  (0,  1)  has  sup  =  1  and  inf  =  0,  but 
has  neither  max  nor  min. 


2.2  Complex  Numbers 

The complex number system $\mathbb{C}$ consists of numbers of the form $z = x + iy$, where
$x, y \in \mathbb{R}$ and $i^2 := -1$. The numbers $x$ and $y$ are called the real and imaginary parts
of $z$. The conjugate of $z$ is

$$
\bar{z} := x - iy,
$$

so that

$$
\operatorname{Re} z = \frac{z + \bar{z}}{2}, \qquad \operatorname{Im} z = \frac{z - \bar{z}}{2i}.
$$


A nonzero complex number has a multiplicative inverse, given by

$$
\frac{1}{x + iy} = \frac{x - iy}{x^2 + y^2}.
$$

The absolute value on $\mathbb{C}$ is the vector absolute value from Euclidean $\mathbb{R}^2$,

$$
|z| := \sqrt{x^2 + y^2}.
$$

This can be written in terms of conjugation,

$$
|z| = \sqrt{z \bar{z}},
$$

which shows in particular that the absolute value is multiplicative,

$$
|zw| = |z||w|
$$

for $z, w \in \mathbb{C}$.
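These identities are easy to check with Python's built-in `complex` type. The sketch below writes each formula out explicitly rather than calling `abs` or `1 / z`, so that the computation mirrors the text. The function names and the sample values of z and w are illustrative.

```python
import math


def real_part(z):
    """Re z = (z + conj(z)) / 2."""
    return ((z + z.conjugate()) / 2).real


def imag_part(z):
    """Im z = (z - conj(z)) / (2i)."""
    return ((z - z.conjugate()) / 2j).real


def inverse(z):
    """1 / (x + iy) = (x - iy) / (x^2 + y^2), valid for z != 0."""
    x, y = z.real, z.imag
    return complex(x, -y) / (x * x + y * y)


def modulus(z):
    """|z| = sqrt(z * conj(z))."""
    return math.sqrt((z * z.conjugate()).real)


if __name__ == "__main__":
    z, w = complex(3.0, -4.0), complex(1.0, 2.0)
    print(real_part(z), imag_part(z))             # real and imaginary parts of z
    print(z * inverse(z))                         # should be close to 1
    print(modulus(z * w), modulus(z) * modulus(w))  # the two values should agree
```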

The basic theory of sequences and series carries over from $\mathbb{R}$ to $\mathbb{C}$ with only minor
changes. A sequence $\{z_k\}$ in $\mathbb{C}$ converges to $z$ if

$$
\lim_{k \to \infty} |z_k - z| = 0,
$$

and a series $\sum a_k$ converges if the sequence of partial sums $\sum_{k=1}^{n} a_k$ is convergent.
The series converges absolutely if

$$
\sum_{k=1}^{\infty} |a_k| < \infty.
$$


It  follows  from  the  completeness  axiom  of  R  that  absolute  convergence  of  a  series 
in  C  implies  convergence. 

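The exponential series,

$$
e^z := \sum_{k=0}^{\infty} \frac{z^k}{k!}, \tag{2.1}
$$

converges absolutely for all $z \in \mathbb{C}$. The special case where $z$ is purely imaginary
gives an important relation called Euler’s formula:

$$
e^{i\theta} = \sum_{k=0}^{\infty} \frac{(-1)^k \theta^{2k}}{(2k)!} + i \sum_{k=0}^{\infty} \frac{(-1)^k \theta^{2k+1}}{(2k+1)!} = \cos\theta + i \sin\theta. \tag{2.2}
$$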
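The convergence of the exponential series and Euler's formula can both be observed numerically. The following Python sketch sums the first few terms of (2.1) and compares the result against `cmath.exp` and against cos θ + i sin θ. The number of terms (40) and the sample arguments are arbitrary illustrative choices.

```python
import cmath
import math


def exp_series(z, terms=40):
    """Partial sum of the exponential series: sum of z**k / k! for k = 0, ..., terms - 1."""
    total = 0j
    term = 1 + 0j  # the k = 0 term, z**0 / 0!
    for k in range(terms):
        total += term
        term *= z / (k + 1)  # turn z**k / k! into z**(k+1) / (k+1)!
    return total


if __name__ == "__main__":
    theta = 0.75
    print(exp_series(1j * theta))
    print(complex(math.cos(theta), math.sin(theta)))
    print(abs(exp_series(1 + 2j) - cmath.exp(1 + 2j)))  # should be at rounding level
```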


Leonhard  Euler,  arguably  the  most  influential  mathematician  of  the  18th  century, 
published  this  identity  in  1748.  It  yields  a  natural  polar-coordinate  representation  of 
complex numbers,

$$
z = r e^{i\theta},
$$

where $r = |z|$ and $\theta$ is the angle between $z$ and the positive real axis.

The product rule for complex exponentials,

$$
e^z e^w = e^{z+w}, \tag{2.3}
$$


follows  from  the  power  series  definition  just  as  in  the  real  case.  In  combination  with 
(2.2)  this  allows  for  a  very  convenient  manipulation  of  trigonometric  functions.  For 
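example, setting $z = i\alpha$ and $w = i\beta$ in (2.3) and taking the real and imaginary parts
recovers the identities

$$
\begin{aligned}
\cos(\alpha + \beta) &= \cos\alpha \cos\beta - \sin\alpha \sin\beta, \\
\sin(\alpha + \beta) &= \cos\alpha \sin\beta + \sin\alpha \cos\beta.
\end{aligned}
$$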
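The same manipulation can be carried out numerically: multiply the two complex exponentials and read off the real and imaginary parts. This Python sketch uses the standard `cmath.exp`; the function name `angle_sum` and the sample angles are illustrative.

```python
import cmath
import math


def angle_sum(alpha, beta):
    """Return (cos(alpha + beta), sin(alpha + beta)), computed as the real and
    imaginary parts of the product e^{i alpha} e^{i beta}."""
    product = cmath.exp(1j * alpha) * cmath.exp(1j * beta)
    return product.real, product.imag


if __name__ == "__main__":
    a, b = 0.4, 1.1
    c, s = angle_sum(a, b)
    # Compare with the right-hand sides of the two addition identities.
    print(c, math.cos(a) * math.cos(b) - math.sin(a) * math.sin(b))
    print(s, math.cos(a) * math.sin(b) + math.sin(a) * math.cos(b))
```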

The  calculus  rules  for  differentiating  and  integrating  exponentials  are  derived 
from  the  power  series  expansion,  and  thus  extend  to  the  complex  case.  In  particular, 


$$
\frac{d}{dx} e^{ax} = a e^{ax}
$$

for $a \in \mathbb{C}$.
2.3 Domains in $\mathbb{R}^n$

For points in $\mathbb{R}^n$ we will use the vector notation $x = (x_1, \dots, x_n)$. The Euclidean
dot product is denoted $x \cdot y$, and the Euclidean length of a vector is written

$$
|x| := \sqrt{x \cdot x}.
$$

The Euclidean distance is used to define limits: $\lim_{k \to \infty} x_k = w$ means that

$$
\lim_{k \to \infty} |x_k - w| = 0.
$$

The ball of radius $R > 0$ centered at a point $x_0 \in \mathbb{R}^n$ is

$$
B(x_0; R) := \{x \in \mathbb{R}^n;\ |x - x_0| < R\}.
$$

A small ball centered at $x_0$ is called a neighborhood of $x_0$. If $x \in A$ has a
neighborhood contained in $A$ then $x$ is called an interior point.

A subset $U \subset \mathbb{R}^n$ is open if all of its points are interior. This generalizes the notion
of an open interval in one dimension. The ball $B(x_0; R)$ is open, for example, as is
$\mathbb{R}^n$ itself. The empty set is open by default.

A boundary point of $A \subset \mathbb{R}^n$ is a point $x \in \mathbb{R}^n$ such that every neighborhood of $x$
intersects both $A$ and its complement. The distinction between interior and boundary
points is illustrated in Fig. 2.1. Note that boundary points may or may not be included
in the set itself. The boundary of $A$ is denoted

$$
\partial A := \{\text{boundary points of } A\}.
$$

For example, the boundary of the ball $B(x_0; R)$ is the sphere

$$
\partial B(x_0; R) = \{x \in \mathbb{R}^n;\ |x - x_0| = R\}.
$$

A set is open if and only if it contains no boundary points.

A subset of $\mathbb{R}^n$ is connected if any two points in the set can be joined by a
continuous path within the set. For an open set $U$ this is equivalent to the condition
that $U$ cannot be written as the disjoint union of two nonempty open sets.

We will refer to a connected open subset $\Omega \subset \mathbb{R}^n$ as a domain, and reserve the
notation $\Omega$ for this usage. For some problems we assume the domain is bounded,
meaning that $\Omega \subset B(0; R)$ for sufficiently large $R$.

The concept of a closed interval can also be generalized to higher dimension. A
subset $F \subset \mathbb{R}^n$ is closed if it contains all of its boundary points, i.e.,

$$
\partial F \subset F.
$$

The union of a subset $A \subset \mathbb{R}^n$ with its boundary is called the closure of $A$ and
denoted

$$
\overline{A} := A \cup \partial A.
$$

For example,

$$
\overline{B(x_0; R)} = \{x \in \mathbb{R}^n;\ |x - x_0| \le R\}.
$$

It is potentially confusing that an overline is used for set closure and complex
conjugation, but these notations are standard. Note that closure applies only to sets and
not to numbers or functions.

A closed set $F \subset \mathbb{R}^n$ contains the limits of all sequences in $F$ that converge in
$\mathbb{R}^n$. This is because the limit of a sequence contained in a set must either be a point
in the set or a boundary point.

Closed and open sets are related in the sense that the complement of an open set
is closed, and vice versa. However, the terms are not mutually exclusive, and a set
might not have either property. The interval $(a, b] \subset \mathbb{R}$ is neither open nor closed,
for example. The only subsets of $\mathbb{R}^n$ with both properties are $\mathbb{R}^n$ itself and $\emptyset$.


2.4  Differentiability 

The space of continuous, complex-valued functions on a domain $\Omega \subset \mathbb{R}^n$ which
admit continuous partial derivatives up to order $m$ is denoted by $C^m(\Omega)$. The
assumption of continuity for derivatives ensures that mixed partials are independent of the
order of differentiation. A smooth function has continuous derivatives to all orders;
the corresponding space is written $C^\infty(\Omega)$.

We use the notation $C^m(\Omega; \mathbb{R})$ to specify real-valued functions, and similarly
$C^m(\Omega; \mathbb{R}^n)$ denotes the space of vector-valued functions. It is common to use $C^m$ as
an adjective, short for “m-times continuously differentiable”.

The definition of $C^m(\Omega)$ imposes no conditions on the behavior of functions as
the boundary is approached. To impose such restrictions, we use the notation $C^m(\overline{\Omega})$
to denote the space of functions that admit $C^m$ extensions across the boundary. For
example, the function $\sqrt{x} \in C^\infty(0, 1)$ is an element of $C^0[0, 1]$, but not $C^1[0, 1]$.

The support of $f \in C^0(\Omega)$ is defined as

$$
\operatorname{supp} f := \overline{\{x \in \Omega;\ f(x) \ne 0\}}. \tag{2.4}
$$

Note that the definition includes a closure. This means that the support does not
exclude points where the function merely “passes through” zero. For example, the
support of $\sin(x)$ is $\mathbb{R}$ rather than $\mathbb{R} \setminus \pi\mathbb{Z}$.

A closed and bounded subset of $\mathbb{R}^n$ is said to be compact. We denote by $C^m_{\mathrm{cpt}}(\Omega)$
the space of functions on $\Omega$ that have compact support, meaning that $\operatorname{supp} f$ is a
compact subset of $\Omega$. Since $\Omega$ is open and the support is closed, this requires in
particular that $\operatorname{supp} f$ be a strict subset of $\Omega$. For example, $1 - x^2$ vanishes at the
boundary of $(-1, 1)$, but does not have compact support in this domain because its
support is $[-1, 1]$.


Example 2.2 To demonstrate the existence of compactly supported smooth
functions, consider

$$
h(x) = \begin{cases} e^{-1/(1 - x^2)}, & |x| < 1, \\ 0, & |x| \ge 1, \end{cases}
$$

which has support $[-1, 1]$. As illustrated in Fig. 2.2, the function becomes extremely
flat as $x \to \pm 1$.

To show that $h$ is in fact smooth, we note that

$$
h^{(m)}(x) = \begin{cases} \dfrac{q_m(x)}{(1 - x^2)^{2m}}\, e^{-1/(1 - x^2)}, & |x| < 1, \\ 0, & |x| \ge 1, \end{cases}
$$

where $q_m$ denotes a polynomial. As $x \to \pm 1$, the term $(1 - x^2)^{-2m}$ blows
up while the exponential term tends rapidly to zero. Using l’Hôpital’s rule one can
check that the exponential dominates this limit, so that all derivatives of $h$ vanish as
$x \to \pm 1$ from $|x| < 1$. This shows that $h \in C^\infty_{\mathrm{cpt}}(\mathbb{R})$.
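The flatness of h near the endpoints is easy to observe numerically. The Python sketch below evaluates h and its first derivative. The derivative uses the m = 1 case of the formula above, with q_1(x) = -2x, which follows from the chain rule. The sample points in the demo are arbitrary.

```python
import math


def bump(x):
    """h(x) = exp(-1 / (1 - x^2)) for |x| < 1, and 0 otherwise."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))


def bump_prime(x):
    """h'(x) = -2x / (1 - x^2)^2 * h(x) for |x| < 1, and 0 otherwise."""
    if abs(x) >= 1.0:
        return 0.0
    return -2.0 * x / (1.0 - x * x) ** 2 * bump(x)


if __name__ == "__main__":
    for x in (0.0, 0.5, 0.9, 0.99, 0.999):
        print(f"x = {x}: h = {bump(x):.3e}, h' = {bump_prime(x):.3e}")
```

Even though the factor (1 - x^2)^(-2) grows without bound, the printed derivative values collapse toward zero as x approaches 1, in line with the limit argument in the text.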

The function $h$ can be integrated to produce a smooth function that is constant for
$|x| \ge 1$. By translating and rescaling, this construction gives, for $a < b$, a function
$\varphi \in C^\infty(\mathbb{R})$ satisfying

$$
\varphi(x) = \begin{cases} 0, & x \le a, \\ 1, & x \ge b. \end{cases}
$$

These “smooth step functions” are useful as building blocks for pasting together
smooth functions on different domains. ◊

Fig. 2.2 Compactly supported smooth function
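A smooth step function of this kind can be approximated numerically by integrating the bump function h and normalizing. The Python sketch below makes two assumptions for illustration: the integral is computed with a simple midpoint rule, so the result is only an approximation of the exact smooth step, and the interval [a, b] is mapped linearly onto [-1, 1]. The function names are hypothetical.

```python
import math


def bump(x):
    """h(x) = exp(-1 / (1 - x^2)) for |x| < 1, and 0 otherwise."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))


def midpoint_integral(g, lo, hi, n=2000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    width = (hi - lo) / n
    return width * sum(g(lo + (i + 0.5) * width) for i in range(n))


def smooth_step(x, a=0.0, b=1.0):
    """Approximate smooth step: 0 for x <= a, 1 for x >= b, increasing in between."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    s = 2.0 * (x - a) / (b - a) - 1.0  # map [a, b] linearly onto [-1, 1]
    return midpoint_integral(bump, -1.0, s) / midpoint_integral(bump, -1.0, 1.0)


if __name__ == "__main__":
    for x in (-0.5, 0.0, 0.25, 0.5, 0.75, 1.0, 1.5):
        print(x, round(smooth_step(x), 6))
```

By the symmetry of h, the value at the midpoint of [a, b] should be very close to 1/2.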

Certain problems require regularity of the boundary of $\Omega$. For example, many
theorems in vector calculus require the existence of a normal vector on the boundary,
which does not exist for a general domain. A standard hypothesis for such theorems
is that $\partial\Omega$ is piecewise $C^1$. This means that $\partial\Omega$ consists of a finite number of
components which admit regular coordinate parametrizations. A coordinate parametrization
is a map $\sigma \in C^1(U; \mathbb{R}^n)$ where $U$ is a domain in $\mathbb{R}^{n-1}$. To say the parametrization is
regular means that the tangent vectors defined by

$$
\frac{\partial \sigma}{\partial w_j} := \left(\frac{\partial \sigma_1}{\partial w_j}, \dots, \frac{\partial \sigma_n}{\partial w_j}\right),
$$

$j = 1, \dots, n - 1$, are linearly independent at each point of $\partial\Omega$.

The piecewise $C^1$ boundary assumption guarantees that the points of each
boundary component have well-defined tangent spaces and normal directions. As an
example, the unit cube in $\mathbb{R}^3$ has piecewise $C^1$ boundary, consisting of 6 planar components
with normal directions parallel to the coordinate axes.

In this text we will focus on relatively simple domains with straightforward
boundary parametrizations.


2.5  Ordinary  Differential  Equations 


Our  development  of  PDE  theory  will  not  rely  on  any  advanced  techniques  for  the 
solution  of  ODE,  but  it  will  be  useful  to  recall  some  basic  material. 

First-order  ODE  can  often  be  solved  directly  by  methods  from  calculus.  The 
easiest  cases  are  equations  of  the  form 


$$
\frac{dy}{dt} = g(y) h(t),
$$

where the variables can be separated to yield an integral formula

$$
\int \frac{dy}{g(y)} = \int h(t)\, dt.
$$

Integrating both sides yields a family of solutions with one undetermined constant.

Example 2.3 Consider the equation

$$
\frac{dy}{dt} = ay, \qquad y(0) = y_0,
$$

for $a \ne 0$. This is called the growth or decay equation, depending on the sign of $a$.
Separating the variables gives

$$
\int \frac{dy}{y} = \int a\, dt,
$$

which integrates to $\ln y = at + C$. Solving for $y$ and using the initial condition gives

$$
y(t) = y_0 e^{at}.
$$

◊

Higher-order ODE are generally analyzed by reducing to a system of first-order
equations. To reduce the $n$th-order equation

$$y^{(n)}(t) = F\left(t, y, y', \dots, y^{(n-1)}\right), \tag{2.5}$$

we define the vector-valued function $w = (y, y', \dots, y^{(n-1)})$. This satisfies the
first-order system

$$\frac{d}{dt}\begin{pmatrix} w_1 \\ \vdots \\ w_{n-1} \\ w_n \end{pmatrix} = \begin{pmatrix} w_2 \\ \vdots \\ w_n \\ F(t, w) \end{pmatrix}.$$
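The reduction to a first-order system is also how higher-order ODE are usually handed to a numerical solver. The following is a minimal sketch, not part of the original text: the helper names `first_order_system`, `rk4_step`, and `solve` are hypothetical, the classical Runge-Kutta scheme is swapped in purely as a convenient integrator, and the test problem $y'' = -y$ is chosen only for illustration.

```python
import math

def rk4_step(rhs, t, w, h):
    """One classical Runge-Kutta step for the system dw/dt = rhs(t, w)."""
    k1 = rhs(t, w)
    k2 = rhs(t + h / 2, [wi + h / 2 * ki for wi, ki in zip(w, k1)])
    k3 = rhs(t + h / 2, [wi + h / 2 * ki for wi, ki in zip(w, k2)])
    k4 = rhs(t + h, [wi + h * ki for wi, ki in zip(w, k3)])
    return [wi + h / 6 * (a + 2 * b + 2 * c + d)
            for wi, a, b, c, d in zip(w, k1, k2, k3, k4)]

def first_order_system(F):
    """Turn y^(n) = F(t, y, y', ..., y^(n-1)) into dw/dt = (w_2, ..., w_n, F(t, w))."""
    def rhs(t, w):
        return list(w[1:]) + [F(t, *w)]
    return rhs

def solve(F, t0, w0, t1, steps=1000):
    """Integrate from t0 to t1 and return w(t1) = (y, y', ..., y^(n-1))."""
    rhs = first_order_system(F)
    h = (t1 - t0) / steps
    t, w = t0, list(w0)
    for _ in range(steps):
        w = rk4_step(rhs, t, w, h)
        t += h
    return w

# Illustrative problem: y'' = -y with y(0) = 0, y'(0) = 1, whose solution is sin t.
y, dy = solve(lambda t, y, yp: -y, 0.0, [0.0, 1.0], 1.0)
print(y, math.sin(1.0))
```

The initial vector `w0` plays the role of the initial values of the function and its first $n - 1$ derivatives.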


First-order systems can be solved generally by the strategy of Picard iteration,
named for mathematician Émile Picard. The first step is to write the vector equation,

$$\frac{dw}{dt} = F(t, w), \qquad w(t_0) = w_0, \tag{2.6}$$

in an equivalent form as a recursive integral equation,

$$w(t) = w_0 + \int_{t_0}^{t} F(s, w(s))\,ds.$$

For the construction, we set $u_0(t) := w_0$ and define a sequence of functions by

$$u_k(t) = w_0 + \int_{t_0}^{t} F(s, u_{k-1}(s))\,ds \tag{2.7}$$


for $k = 1, 2, \dots$. It can be shown that the limit of this sequence exists and solves
(2.6) under some general assumptions on $F$, which leads to a proof of the following
result.

Theorem 2.4 (Picard iteration) Suppose that $F$ is a continuous function on $I \times \Omega$
where $I$ is an open interval containing $t_0$ and $\Omega$ is a domain in $\mathbb{R}^n$ containing $w_0$, and
that $F$ is continuously differentiable with respect to $w$. Then (2.6) admits a unique
solution on some interval $(t_0 - \varepsilon, t_0 + \varepsilon)$ with $\varepsilon > 0$.

Applying Theorem 2.4 to (2.5) shows that an $n$th-order ODE satisfying the regularity
assumptions has a unique local solution specified by the initial values of the
function and its first $n - 1$ derivatives.

The  C1  hypothesis  on  F  is  stronger  than  necessary,  but  this  version  will  suffice  for 
our  purposes.  The  point  we  would  like  to  stress  here  is  the  relative  ease  with  which 
ODE  can  be  analyzed  under  very  general  conditions.  This  is  very  different  from  the 
PDE  theory,  where  no  such  general  results  are  possible. 

Example 2.5 The harmonic ODE is the equation

$$\frac{d^2y}{dt^2} = -\kappa^2 y$$

for $\kappa > 0$. In view of the solution to the growth/decay equation in Example 2.3, it is
reasonable to start with an exponential solution as a guess. Substituting $e^{\alpha t}$ into the
equation yields $\alpha^2 = -\kappa^2$. From $\alpha = \pm i\kappa$ we obtain the general solution,

$$y(t) = c_1 e^{i\kappa t} + c_2 e^{-i\kappa t}.$$

To see how this relates to the Picard iteration method described above, consider
the corresponding system (2.6) for $w = (y, y')$:

$$\frac{dw}{dt} = \begin{pmatrix} 0 & 1 \\ -\kappa^2 & 0 \end{pmatrix} w.$$

With $w_0 = (a, b)$, the recursive formula (2.7) yields the sequence of functions

$$u_k(t) = \sum_{j=0}^{k} \frac{t^j}{j!} \begin{pmatrix} 0 & 1 \\ -\kappa^2 & 0 \end{pmatrix}^{j} \begin{pmatrix} a \\ b \end{pmatrix}$$

for $k \in \mathbb{N}$. In the limit $k \to \infty$ this gives

$$w(t) = \left[1 - \frac{(\kappa t)^2}{2!} + \frac{(\kappa t)^4}{4!} - \cdots\right] \begin{pmatrix} a \\ b \end{pmatrix} + \frac{1}{\kappa}\left[\kappa t - \frac{(\kappa t)^3}{3!} + \frac{(\kappa t)^5}{5!} - \cdots\right] \begin{pmatrix} b \\ -\kappa^2 a \end{pmatrix}.$$

Reading off $y$ as the first component of $w$ gives the familiar trigonometric solution,

$$y(t) = a\cos(\kappa t) + \frac{b}{\kappa}\sin(\kappa t).$$

The trigonometric and complex exponential solutions are related by Euler's formula
(2.2). ◊
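The convergence of the Picard iterates for the harmonic system can be observed numerically. The sketch below assumes $t_0 = 0$, so that the $k$th iterate is the partial sum $\sum_{j \le k} (t^j/j!) A^j w_0$ with $A$ the coefficient matrix of the system; the function name `picard_harmonic` and the sample values of $\kappa$, $a$, $b$, $t$ are illustrative choices, not taken from the text.

```python
import math

def picard_harmonic(kappa, a, b, t, k):
    """k-th Picard iterate u_k(t) = sum_{j<=k} (t^j / j!) A^j (a, b)
    for A = [[0, 1], [-kappa^2, 0]], starting from t_0 = 0."""
    term = (a, b)            # the j = 0 term
    total = [a, b]
    for j in range(1, k + 1):
        # Multiply the previous term by (t / j) * A, using A (p, q) = (q, -kappa^2 p).
        term = (t / j * term[1], -t / j * kappa**2 * term[0])
        total[0] += term[0]
        total[1] += term[1]
    return total

kappa, a, b, t = 2.0, 1.0, 3.0, 0.7   # sample values
y_k, dy_k = picard_harmonic(kappa, a, b, t, 30)
y_exact = a * math.cos(kappa * t) + b / kappa * math.sin(kappa * t)
print(y_k, y_exact)
```

With 30 terms the first component agrees with $a\cos(\kappa t) + (b/\kappa)\sin(\kappa t)$ to within rounding error for these values.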


2.6  Vector  Calculus 


The classical theorems of vector calculus were motivated by PDE problems arising
in physics. For our purposes the most important of these results is the divergence
theorem. We assume that the reader is familiar with the divergence theorem in the
context of $\mathbb{R}^2$ or $\mathbb{R}^3$. In this section we will cover the basic definitions needed to state
the result in $\mathbb{R}^n$ and develop its corollaries.

As noted in Sect. 2.3, we always take a domain $\Omega \subset \mathbb{R}^n$ to be connected and
open. The gradient of $f \in C^1(\Omega)$ is the vector-valued function

$$\nabla f := \left(\frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n}\right).$$

For a vector-valued function $v \in C^1(\Omega; \mathbb{R}^n)$ with components $(v_1, \dots, v_n)$, the
divergence is

$$\nabla \cdot v := \frac{\partial v_1}{\partial x_1} + \cdots + \frac{\partial v_n}{\partial x_n}.$$

The Laplacian operator introduced in (1.7) is the divergence of the gradient

$$\Delta u := \nabla \cdot (\nabla u).$$

For this reason $\Delta$ is sometimes written $\nabla^2$.

If $\Omega$ is bounded then the Riemann integral of $f \in C^0(\overline{\Omega})$ exists and is denoted
by

$$\int_\Omega f(x)\,d^n x,$$

where $d^n x$ is a shorthand for $dx_1 \cdots dx_n$. The integral can be extended to unbounded
domains if the appropriate limits exist. We will discuss a further generalization of
the Riemann definition in Chap. 7.

One issue we will come across frequently is differentiation under the integral.
If $\Omega \subset \mathbb{R}^n$ is a bounded domain and $u$ and $\partial u/\partial t$ are continuous functions on
$(a, b) \times \overline{\Omega}$, then the Leibniz integral rule says that

$$\frac{d}{dt}\int_\Omega u(t, x)\,d^n x = \int_\Omega \frac{\partial u}{\partial t}(t, x)\,d^n x.$$

Differentiation under the integral may still work when the integrals are improper, but
this requires greater care.

To set up boundary integrals for a domain with piecewise $C^1$ boundary, we need
to define the surface integral over a regular coordinate patch $\sigma : U \subset \mathbb{R}^{n-1} \to \partial\Omega$.

Let $\nu : U \to \mathbb{R}^n$ denote the unit normal vector pointing outwards from the domain.
The surface integral for such a patch is defined by

$$\int_{\sigma(U)} f\,dS := \int_U f(\sigma(w)) \left|\det\left[\frac{\partial\sigma}{\partial w_1}, \dots, \frac{\partial\sigma}{\partial w_{n-1}}, \nu\right]\right| d^{n-1}w, \tag{2.8}$$

where $\det[\dots]$ denotes the determinant of a matrix of column vectors. The full surface
integral over $\partial\Omega$ is defined by summing over the boundary coordinate patches. For
simplicity, we notate this as a single integral,

$$\int_{\partial\Omega} f\,dS.$$

In $\mathbb{R}^2$, a boundary parametrization will be a curve $\sigma(t)$ and (2.8) reduces to the
arclength integral

$$\int_{\sigma(U)} f\,dS = \int_U f(\sigma(t)) \left|\frac{d\sigma}{dt}\right| dt.$$

In $\mathbb{R}^3$, the unit normal for a surface patch can be computed from the cross product
of the tangent vectors. This leads to the surface integral formula

$$\int_{\sigma(U)} f\,dS := \int_U f(\sigma(w)) \left|\frac{\partial\sigma}{\partial w_1} \times \frac{\partial\sigma}{\partial w_2}\right| d^2w.$$

Even in low dimensions surface integrals can be rather complicated. We will make
explicit use of these formulas only in relatively simple cases, such as rectangular
regions and spheres.

We can use (2.8) to decompose integrals into radial and spherical components.
This is particularly useful when the domain is a ball. Let $r := |x|$ be the radial
coordinate, and define the unit sphere

$$\mathbb{S}^{n-1} := \{r = 1\} \subset \mathbb{R}^n.$$

A point $x \neq 0$ can be written uniquely as $r\omega$ for $\omega \in \mathbb{S}^{n-1}$ and $r > 0$. Let $\omega(y)$ be a
parametrization of $\mathbb{S}^{n-1}$ by coordinates $y \in U \subset \mathbb{R}^{n-1}$. For the change of variables
$(r, y) \mapsto x = r\omega(y)$, the Jacobian formula gives

$$d^n x = \left|\det\left[\frac{\partial x}{\partial y_1}, \dots, \frac{\partial x}{\partial y_{n-1}}, \frac{\partial x}{\partial r}\right]\right| dr\,dy_1 \cdots dy_{n-1} = r^{n-1}\left|\det\left[\frac{\partial\omega}{\partial y_1}, \dots, \frac{\partial\omega}{\partial y_{n-1}}, \omega\right]\right| dr\,dy_1 \cdots dy_{n-1}. \tag{2.9}$$

On the unit sphere, the outward unit normal $\nu$ is equal to $\omega$. Thus (2.9) reduces to

$$d^n x = r^{n-1}\,dr\,dS(y).$$

For an integral over the ball this yields the radial integral formula,

$$\int_{B(0;R)} f(x)\,d^n x = \int_{\mathbb{S}^{n-1}} \int_0^R f(r\omega(y))\,r^{n-1}\,dr\,dS(y). \tag{2.10}$$
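As a sanity check on the radial integral formula, the case $n = 2$ can be evaluated numerically. The sketch below assumes the parametrization $\omega(y) = (\cos y, \sin y)$ of the unit circle, for which $dS(y) = dy$, and uses a simple midpoint rule; the function name `disc_integral` and the test integrand $f(x) = x_1^2$ are illustrative choices.

```python
import math

def disc_integral(f, R, nr=400, ntheta=400):
    """Midpoint-rule approximation of the right side of (2.10) for n = 2,
    with omega(y) = (cos y, sin y) and weight r^(n-1) = r."""
    dr, dth = R / nr, 2 * math.pi / ntheta
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * dr
        for j in range(ntheta):
            th = (j + 0.5) * dth
            total += f(r * math.cos(th), r * math.sin(th)) * r * dr * dth
    return total

# The integral of x_1^2 over the unit disc is pi / 4.
approx = disc_integral(lambda x1, x2: x1**2, 1.0)
print(approx, math.pi / 4)
```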


With  these  definitions  in  place,  we  turn  to  the  divergence  theorem,  which  relates 
the  flux  of  a  vector  field  through  a  closed  surface  to  the  divergence  of  the  field  in  the 
interior.  This  result  is  generally  attributed  to  Carl  Friedrich  Gauss,  who  published  a 
version  in  1813  in  conjunction  with  his  work  on  electrostatics. 

Theorem 2.6 (Divergence theorem) Suppose $\Omega \subset \mathbb{R}^n$ is a bounded domain with
piecewise $C^1$ boundary. For a vector field $F \in C^1(\overline{\Omega}; \mathbb{R}^n)$,

$$\int_\Omega \nabla \cdot F\,d^n x = \int_{\partial\Omega} F \cdot \nu\,dS,$$

where $\nu$ is the outward unit normal to $\partial\Omega$.

A full proof can be found in advanced calculus texts. To illustrate the idea, we
will show how the argument works for a spherical domain in $\mathbb{R}^3$.

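Example 2.7 Let $\mathbb{B}^3 = \{r < 1\} \subset \mathbb{R}^3$. Because a vector field can be decomposed
into components, it suffices to consider a field parallel to one of the coordinate axes,
say $F = (0, 0, f)$. The divergence is then

$$\nabla \cdot F = \frac{\partial f}{\partial x_3}.$$

In cylindrical coordinates, $x = (\rho\cos\theta, \rho\sin\theta, z)$, the volume element is

$$d^3x = \rho\,d\rho\,d\theta\,dz,$$

so the left side of the divergence formula becomes

$$\int_{\mathbb{B}^3} \nabla \cdot F\,d^3x = \int_0^{2\pi}\int_0^1\int_{-\sqrt{1-\rho^2}}^{\sqrt{1-\rho^2}} \frac{\partial f}{\partial z}\,\rho\,dz\,d\rho\,d\theta = \int_0^{2\pi}\int_0^1 \left[f\left(\rho\cos\theta, \rho\sin\theta, \sqrt{1-\rho^2}\right) - f\left(\rho\cos\theta, \rho\sin\theta, -\sqrt{1-\rho^2}\right)\right] \rho\,d\rho\,d\theta. \tag{2.11}$$

Note that $z = \pm\sqrt{1-\rho^2}$ gives the restriction to the upper and lower hemispheres,
respectively.

We denote the two hemispheres $\mathbb{S}^2_\pm \subset \mathbb{S}^2$ and parametrize them as

$$\omega_\pm(\rho, \theta) = \left(\rho\cos\theta, \rho\sin\theta, \pm\sqrt{1-\rho^2}\right).$$

The corresponding surface area elements are given by

$$dS = \left|\frac{\partial\omega_\pm}{\partial\rho} \times \frac{\partial\omega_\pm}{\partial\theta}\right| d\rho\,d\theta = \frac{\rho}{\sqrt{1-\rho^2}}\,d\rho\,d\theta.$$

Thus,

$$\int_0^{2\pi}\int_0^1 f\left(\rho\cos\theta, \rho\sin\theta, \pm\sqrt{1-\rho^2}\right) \rho\,d\rho\,d\theta = \int_{\mathbb{S}^2_\pm} f\sqrt{1-\rho^2}\,dS.$$

On $\mathbb{S}^2_\pm$ we have $F \cdot \nu = \pm f\sqrt{1-\rho^2}$, so that

$$\int_{\mathbb{S}^2_\pm} f\sqrt{1-\rho^2}\,dS = \pm\int_{\mathbb{S}^2_\pm} F \cdot \nu\,dS.$$

Applying this to (2.11) reduces the equation to

$$\int_{\mathbb{B}^3} \nabla \cdot F\,d^3x = \int_{\mathbb{S}^2} F \cdot \nu\,dS,$$

verifying the divergence theorem in this special case. ◊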
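The divergence theorem on the unit ball can also be checked numerically for a field of the form $F = (0, 0, f)$. The sketch below is only an illustration: it uses spherical rather than cylindrical coordinates (volume element $r^2 \sin\varphi\,dr\,d\varphi\,d\theta$, surface element $\sin\varphi\,d\varphi\,d\theta$, outward normal $\nu = x$), a midpoint rule, and the sample choice $f(x) = x_3^3$, for which both sides equal $4\pi/5$.

```python
import math

def f(x1, x2, x3):
    """Sample component of F = (0, 0, f), chosen only for illustration."""
    return x3**3

def df_dx3(x1, x2, x3):
    """Derivative of f in x_3, which equals div F for F = (0, 0, f)."""
    return 3 * x3**2

def point(r, phi, th):
    """Spherical coordinates: phi is the polar angle, th the azimuthal angle."""
    return (r * math.sin(phi) * math.cos(th),
            r * math.sin(phi) * math.sin(th),
            r * math.cos(phi))

def volume_side(n=60):
    """Midpoint approximation of the integral of div F over the unit ball."""
    dr, dphi, dth = 1 / n, math.pi / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            phi = (j + 0.5) * dphi
            for k in range(n):
                th = (k + 0.5) * dth
                total += df_dx3(*point(r, phi, th)) * r**2 * math.sin(phi)
    return total * dr * dphi * dth

def surface_side(n=60):
    """Midpoint approximation of the flux of F through the unit sphere."""
    dphi, dth = math.pi / n, 2 * math.pi / n
    total = 0.0
    for j in range(n):
        phi = (j + 0.5) * dphi
        for k in range(n):
            th = (k + 0.5) * dth
            x = point(1.0, phi, th)
            total += f(*x) * x[2] * math.sin(phi)   # F . nu = f * nu_3 with nu = x
    return total * dphi * dth

vol, surf = volume_side(), surface_side()
print(vol, surf, 4 * math.pi / 5)
```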


Theorem 2.6 can be used to evaluate integrals of the Laplacian of a function by
substituting $F = \nabla u$ for the vector field. Inside the volume integral this yields the
integrand

$$\nabla \cdot F = \Delta u.$$

On the surface side, the integrand becomes the directional derivative with respect to
the outward unit normal, which is denoted

$$\frac{\partial u}{\partial \nu} := \nu \cdot \nabla u\,\Big|_{\partial\Omega}.$$

Corollary 2.8 If $\Omega \subset \mathbb{R}^n$ is a bounded domain with piecewise $C^1$ boundary, and
$u \in C^2(\overline{\Omega})$, then

$$\int_\Omega \Delta u\,d^n x = \int_{\partial\Omega} \frac{\partial u}{\partial \nu}\,dS.$$


The application we will encounter most frequently is to the ball $B(0; R) \subset \mathbb{R}^n$.
The outward unit normal is parallel to the position vector, so that

$$\nu = \frac{x}{r}.$$

It follows from the chain rule that

$$\frac{\partial u}{\partial \nu} = \frac{\partial u}{\partial r}. \tag{2.12}$$


Example 2.9 Consider a radial function $g(r)$ where $r := |x|$ for $x \in \mathbb{R}^n$. For the
ball $B(0; a)$, the radial integral formula (2.10) gives

$$\int_{B(0;a)} \Delta g\,d^n x = A_n \int_0^a \Delta g(r)\,r^{n-1}\,dr,$$

where

$$A_n := \operatorname{vol}(\mathbb{S}^{n-1}). \tag{2.13}$$

By (2.12),

$$\int_{\partial B(0;a)} \frac{\partial g}{\partial \nu}\,dS = \int_{\partial B(0;a)} \frac{\partial g}{\partial r}(a)\,dS = A_n a^{n-1}\frac{\partial g}{\partial r}(a).$$

The formula from Corollary 2.8 reduces in this case to

$$\int_0^a \Delta g(r)\,r^{n-1}\,dr = a^{n-1}\frac{\partial g}{\partial r}(a). \tag{2.14}$$

Differentiating (2.14) with respect to $a$ gives, by the fundamental theorem of
calculus,

$$a^{n-1}\Delta g(a) = \frac{d}{da}\left[a^{n-1}\frac{\partial g}{\partial r}(a)\right].$$

This holds for all $a > 0$, so evidently the Laplacian of a radial function is given by

$$\Delta g = r^{1-n}\frac{\partial}{\partial r}\left(r^{n-1}\frac{\partial}{\partial r}\right) g. \tag{2.15}$$

In principle one could derive this formula directly from the chain rule, but the direct
computation is difficult in high dimensions. ◊
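The radial Laplacian formula $\Delta g = r^{1-n}\,\partial_r(r^{n-1}\,\partial_r g)$ can be compared against a direct Cartesian computation by finite differences. The sketch below is illustrative only: the profile $g(r) = e^{-r^2}$, the evaluation point in $\mathbb{R}^3$, and the step size are arbitrary choices, and the helper names are hypothetical.

```python
import math

def radial_laplacian(g, r, n, h=1e-3):
    """Finite-difference version of r^(1-n) d/dr ( r^(n-1) dg/dr )."""
    def flux(s):
        # s^(n-1) * g'(s), with g' approximated by a central difference
        return s**(n - 1) * (g(s + h / 2) - g(s - h / 2)) / h
    return r**(1 - n) * (flux(r + h / 2) - flux(r - h / 2)) / h

def cartesian_laplacian(u, x, h=1e-3):
    """Sum of second central differences in each coordinate direction."""
    total = 0.0
    for j in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[j] += h
        xm[j] -= h
        total += (u(xp) - 2 * u(x) + u(xm)) / h**2
    return total

def g(r):
    """Sample radial profile."""
    return math.exp(-r**2)

def u(x):
    """The same function, written in Cartesian coordinates."""
    return g(math.sqrt(sum(c * c for c in x)))

x = [0.3, -0.4, 0.5]
r = math.sqrt(sum(c * c for c in x))
print(radial_laplacian(g, r, 3), cartesian_laplacian(u, x))
```

Both numbers should agree with the closed form $(4r^2 - 6)e^{-r^2}$ that the radial formula gives for this profile when $n = 3$.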

There  are  two  other  direct  corollaries  of  the  divergence  theorem  that  will  be  used 
frequently.  These  are  named  for  the  mathematical  physicist  George  Green,  who  used 
them  to  develop  solution  formulas  for  some  classical  PDE. 

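The first result is a generalization of Corollary 2.8, obtained from Theorem 2.6
by the substitution $F = v\nabla u$ for a pair of functions $u$, $v$. The product rule for
differentiation gives

$$\nabla \cdot (v\nabla u) = \nabla v \cdot \nabla u + v\Delta u, \tag{2.16}$$

which can easily be checked by writing out the components of the gradient.

Theorem 2.10 (Green's first identity) If $\Omega \subset \mathbb{R}^n$ is a bounded domain with piecewise
$C^1$ boundary, then for $u \in C^2(\overline{\Omega})$ and $v \in C^1(\overline{\Omega})$,

$$\int_\Omega \left[\nabla v \cdot \nabla u + v\Delta u\right] d^n x = \int_{\partial\Omega} v\frac{\partial u}{\partial \nu}\,dS.$$

The second identity follows from the first by interchanging $u$ with $v$ and then
subtracting the result.

Theorem 2.11 (Green's second identity) If $\Omega \subset \mathbb{R}^n$ is a bounded domain with
piecewise $C^1$ boundary, then for $u, v \in C^2(\overline{\Omega})$,

$$\int_\Omega \left[v\Delta u - u\Delta v\right] d^n x = \int_{\partial\Omega} \left(v\frac{\partial u}{\partial \nu} - u\frac{\partial v}{\partial \nu}\right) dS.$$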
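For readers who want the subtraction step spelled out, here is a short worked version. It assumes only Green's first identity, applied once as stated and once with the roles of $u$ and $v$ exchanged (which is why both functions must be $C^2$).

```latex
\begin{align*}
\int_\Omega \left[\nabla v \cdot \nabla u + v\,\Delta u\right] d^n x
  &= \int_{\partial\Omega} v\,\frac{\partial u}{\partial \nu}\,dS, \\
\int_\Omega \left[\nabla u \cdot \nabla v + u\,\Delta v\right] d^n x
  &= \int_{\partial\Omega} u\,\frac{\partial v}{\partial \nu}\,dS.
\end{align*}
% The gradient terms are identical, so they cancel on subtraction:
\[
\int_\Omega \left[v\,\Delta u - u\,\Delta v\right] d^n x
  = \int_{\partial\Omega} \left(v\,\frac{\partial u}{\partial \nu}
    - u\,\frac{\partial v}{\partial \nu}\right) dS.
\]
```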


2.7  Exercises 

2.1 For $r := |x|$ in $\mathbb{R}^n$, and $\alpha \in \mathbb{R}$, compute $\nabla(r^\alpha)$ and $\Delta(r^\alpha)$.

2.2 Polar coordinates $(r, \theta)$ in $\mathbb{R}^2$ are related to Cartesian coordinates $(x_1, x_2)$ by

$$x_1 = r\cos\theta, \qquad x_2 = r\sin\theta.$$

(a) Use the chain rule to compute $\frac{\partial}{\partial x_1}$ and $\frac{\partial}{\partial x_2}$ in terms of $\frac{\partial}{\partial r}$ and $\frac{\partial}{\partial \theta}$.

(b) Find the expression for $\Delta$ in the $(r, \theta)$ coordinates. (The radial part should agree
with (2.15).)

2.3 In $\mathbb{R}^n$ let $\Omega$ be the unit cube $(0, 1)^n$. Define

$$F(x) = f(x)e_j,$$

where $f \in C^\infty(\overline{\Omega})$ and $e_j$ is the $j$th coordinate vector, $e_j := (0, \dots, 1, \dots, 0)$.
Compute both sides of the formula from Theorem 2.6 in this case, and show that
the result reduces to an application of the fundamental theorem of calculus in the $x_j$
variable.

2.4 For $f \in C^0(\mathbb{R}^n)$ set

$$h(t) := \int_{B(0;t)} f(x)\,d^n x$$

for $t > 0$. Use the radial decomposition formula (2.10) to show that

$$\frac{dh}{dt} = \int_{\partial B(0;t)} f(w)\,dS(w).$$

2.5 The gamma function is defined for $z > 0$ by

$$\Gamma(z) := \int_0^\infty t^{z-1}e^{-t}\,dt. \tag{2.17}$$

Note that $\Gamma(1) = 1$ and integration by parts gives the recursion relation $\Gamma(z + 1) =
z\Gamma(z)$. In this problem we will show that the volume of the unit sphere in $\mathbb{R}^n$ is given
by

$$A_n = \frac{2\pi^{n/2}}{\Gamma\left(\frac{n}{2}\right)}. \tag{2.18}$$

(a) Use the radial formula (2.10) and the substitution $u := r^2$ to compute that

$$\int_{\mathbb{R}^n} e^{-r^2}\,d^n x = \frac{1}{2}A_n\,\Gamma\left(\frac{n}{2}\right).$$

(b) Observe that we can rewrite

$$\int_{\mathbb{R}^n} e^{-r^2}\,d^n x = \left(\int_{-\infty}^{\infty} e^{-x^2}\,dx\right)^n. \tag{2.19}$$

Substitute $t = x^2$ to evaluate the one-dimensional integral in terms of $\Gamma(\frac{1}{2})$.

(c) Compare (a) to (b) to obtain a formula for $A_n$.

(d) Use (c) and the fact that $A_2 = 2\pi$ to compute $\Gamma(\frac{1}{2})$ and reduce the formula to
(2.18).


Chapter  3 

Conservation  Equations  and  Characteristics 


A  conservation  law  for  a  physical  system  states  that  a  certain  quantity  (e.g.,  mass, 
energy,  or  momentum)  is  independent  of  time.  For  continuous  systems  such  as  fluids 
or  gases,  these  global  quantities  can  be  defined  as  integrals  of  density  functions.  The 
conservation  law  then  translates  into  a  local  form,  as  a  PDE  for  the  density  function. 

In  this  section  we  will  study  some  first-order  PDE  that  arise  from  conservation 
laws.  We  introduce  a  classic  technique,  called  the  method  of  characteristics,  for 
analyzing  these  equations. 


3.1  Model  Problem:  Oxygen  in  the  Bloodstream 


To  derive  the  conservation  equation,  we  consider  a  simple  model  for  the  concentration 
of  oxygen  carried  by  the  bloodstream.  For  this  discussion  we  ignore  any  external 
effects  that  might  break  the  conservation  of  mass,  such  as  absorption  of  oxygen  into 
the  walls  of  a  blood  vessel.  (Some  examples  of  external  effects  will  be  considered  in 
the  exercises.) 

Let us model an artery as a straight tube, as pictured in Fig. 3.1. We assume that the
concentration is constant on cross-sections of the tube, so that the problem reduces
to one spatial dimension. For the moment, suppose that the artery extends along the
real line and is parametrized by $x \in \mathbb{R}$.

Let $u(t, x)$ denote the oxygen concentration, expressed in units of mass per unit
length. Within a fixed interval $[a, b]$, as highlighted in Fig. 3.1, the total mass at time
$t$ is given by an integral,

$$m(t) := \int_a^b u(t, x)\,dx. \tag{3.1}$$


The original version of the book was revised: Belated corrections from author have been incorporated.
The erratum to the book is available at https://doi.org/10.1007/978-3-319-48936-0_14


©  Springer  International  Publishing  AG  2016 
D.  Borthwick,  Introduction  to  Partial  Differential  Equations , 
Universitext,  DOI  10.1007/978-3-319-48936-0_3 
Fig. 3.1 One-dimensional model of an artery (the points $a$ and $b$ are marked along the $x$-axis)

The instantaneous flow rate at a given point $x$ is called the flux $q(t, x)$, expressed
as mass per unit time. The general relationship between flux and concentration is


flux  =  (concentration)  x  (velocity). 


For  the  bloodstream  model  we  can  reasonably  assume  that  velocity  is  independent 
of  the  oxygen  concentration  (because  oxygen  accounts  for  a  relatively  small  portion 
of  the  total  density).  This  assumption  implies  that  q  has  a  linear  dependence  on  u.  In 
other  models  the  velocity  might  depend  on  the  concentration,  making  q  a  nonlinear 
function  of  u. 

Conservation  of  mass  implies  that  the  total  amount  of  oxygen  within  the  segment 
changes  only  as  oxygen  flows  across  the  boundary  points  at  x  =  a  and  x  =  b.  Since 
the  flow  across  these  points  is  given  by  the  flux,  the  corresponding  equation  is 


$$\frac{dm}{dt}(t) = q(t, a) - q(t, b). \tag{3.2}$$

If $q$ is continuously differentiable with respect to position, then the fundamental
theorem of calculus allows us to write the right-hand side of (3.2) as an integral,

$$q(t, a) - q(t, b) = -\int_a^b \frac{\partial q}{\partial x}(t, x)\,dx.$$

We can also differentiate the integral in (3.1) to obtain

$$\frac{dm}{dt} = \int_a^b \frac{\partial u}{\partial t}\,dx,$$

provided that $u(t, x)$ is continuously differentiable with respect to time. These calculations
transform (3.2) into the integral equation

$$\int_a^b \left(\frac{\partial u}{\partial t} + \frac{\partial q}{\partial x}\right) dx = 0. \tag{3.3}$$

Since the segment was arbitrary, (3.3) should hold for all values of $a$, $b$. This is
only possible if the integrand is identically zero, which gives the local form of the
law of conservation of mass:

$$\frac{\partial u}{\partial t} + \frac{\partial q}{\partial x} = 0. \tag{3.4}$$

This relationship between concentration and flux is called the continuity equation
(or transport equation). The continuity equation applies generally to the physical
process of advection, which refers to the motion of particles in a bulk fluid flow.

To adapt (3.4) to a particular model, we need to specify the relationship between
$q$ and $u$. As we remarked above, for the bloodstream model it is reasonable to assume
a linear relationship,

$$q = vu, \tag{3.5}$$

where the velocity $v(t, x)$ is part of the input data for the equation. Under this
assumption (3.4) reduces to

$$\frac{\partial u}{\partial t} + v\frac{\partial u}{\partial x} + u\frac{\partial v}{\partial x} = 0, \tag{3.6}$$

which is called the linear conservation equation.


3.2  Lagrangian  Derivative  and  Characteristics 


In  this  section  we  will  discuss  the  strategy  for  solving  a  first-order  PDE  such  as  (3.6). 
The  basic  idea  is  to  adopt  the  perspective  of  an  observer  traveling  with  velocity  v. 
This  is  like  taking  measurements  in  a  river  from  a  raft  drawn  by  the  current.  Once 
we  fix  a  starting  point  for  the  observer,  the  observed  concentration  depends  only  on 
the  time  variable,  thus  reducing  the  equation  to  an  ODE. 

This principle applies to any first-order PDE of the form

$$\frac{\partial u}{\partial t} + v\frac{\partial u}{\partial x} + w = 0, \tag{3.7}$$

where $v = v(t, x)$ is independent of $u$. The zeroth-order term $w$ could be a general
function $w(t, x, u)$. A trajectory $t \mapsto x(t)$ is called a characteristic for the equation
(3.7) if

$$\frac{dx}{dt}(t) = v(t, x(t)). \tag{3.8}$$

For $v$ and $\partial v/\partial x$ continuous, Theorem 2.4 shows that a unique solution exists in the
neighborhood of each starting point $(t_0, x_0)$.

Example 3.1 Suppose $v(t, x) = at + b$, with $a$ and $b$ constant. Integration over $t$
gives

$$x(t) = \frac{a}{2}t^2 + bt + x_0.$$

The characteristics are a family of curves indexed by the parameter $x_0$, as illustrated
in Fig. 3.2. ◊

Fig. 3.2 Sample characteristics for the velocity $v(t, x) = 1 + 2t$
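When the characteristic equation $dx/dt = v(t, x(t))$ cannot be integrated by hand, the curves can be traced numerically. The following sketch uses a classical Runge-Kutta step (an arbitrary choice of integrator, with the hypothetical helper name `characteristic`) and compares the result with the closed form $x(t) = \frac{a}{2}t^2 + bt + x_0$ for the velocity $v = at + b$ with $a = 2$, $b = 1$.

```python
def characteristic(v, t0, x0, t1, steps=1000):
    """Integrate dx/dt = v(t, x) from (t0, x0) up to time t1 with RK4."""
    h = (t1 - t0) / steps
    t, x = t0, x0
    for _ in range(steps):
        k1 = v(t, x)
        k2 = v(t + h / 2, x + h / 2 * k1)
        k3 = v(t + h / 2, x + h / 2 * k2)
        k4 = v(t + h, x + h * k3)
        x += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return x

a, b = 2.0, 1.0     # v(t, x) = 1 + 2t
for x0 in (-1.0, 0.0, 1.0):
    numeric = characteristic(lambda t, x: a * t + b, 0.0, x0, 1.5)
    exact = a / 2 * 1.5**2 + b * 1.5 + x0
    print(x0, numeric, exact)
```

Each starting value `x0` selects one member of the family of characteristic curves.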


From the point of view of an observer carried by the flow, the measured concentration
is $u(t, x(t))$. The observed rate of change is the derivative of this quantity,

$$\frac{Du}{Dt}(t) := \frac{d}{dt}u(t, x(t)), \tag{3.9}$$

called the Lagrangian derivative (or material derivative). This concept was developed
by the 18th century mathematician and physicist Joseph-Louis Lagrange. Note
that $Du/Dt$ depends also on the initial value $(t_0, x_0)$ that determines the characteristic.
For convenience we suppress the initial point from the notation.

Theorem 3.2 On each characteristic, (3.7) reduces to the ODE

$$\frac{Du}{Dt} + w = 0, \tag{3.10}$$

where $w$ is the restriction of $w$ to the characteristic,

$$w(t) := w(t, x(t), u(t, x(t))).$$

In particular, if $w = 0$ then $u$ is constant on each characteristic.

Proof Applying the chain rule in (3.9) gives

$$\frac{Du}{Dt} = \frac{\partial u}{\partial t} + \frac{\partial u}{\partial x}\frac{dx}{dt},$$

with the understanding that the partial derivatives on the right are evaluated at the
point $(t, x(t))$. Because $x(t)$ solves (3.8), this reduces to

$$\frac{Du}{Dt} = \frac{\partial u}{\partial t} + v\frac{\partial u}{\partial x}. \tag{3.11}$$

If we restrict the variables in (3.7) to $(t, x(t))$, then the first two terms match the
right-hand side of (3.11), reducing the equation to (3.10).

If $w = 0$, then (3.10) becomes

$$\frac{Du}{Dt} = 0.$$

This is equivalent to the statement that $u(t, x(t))$ is independent of $t$. □

With  Theorem  3.2  we  can  effectively  reduce  the  PDE  (3.7)  to  a  pair  of  ODE, 
namely  the  characteristic  equation  (3.8)  and  the  Lagrangian  derivative  equation 
(3.10).  In  many  cases,  solving  these  ODE  will  lead  to  an  explicit  formula  for  u(t,  x). 
This  approach  is  referred  to  as  the  method  of  characteristics. 

Example 3.3 For constants $a, b \in \mathbb{R}$, assume that $u(t, x)$ satisfies

$$\frac{\partial u}{\partial t} + (at + b)\frac{\partial u}{\partial x} = 0,$$

with the initial condition

$$u(0, x) = g(x),$$

for some function $g \in C^1(\mathbb{R})$. The characteristics for this velocity, $v(t, x) = at + b$,
were computed in Example 3.1.

According to Theorem 3.2, $u$ is constant along characteristics, implying that

$$u\left(t, \frac{a}{2}t^2 + bt + x_0\right) = u(0, x_0) = g(x_0), \tag{3.12}$$

for all $t \in \mathbb{R}$. This is not yet a formula for $u(t, x)$, but we can derive the solution
formula by identifying

$$x = \frac{a}{2}t^2 + bt + x_0.$$

Solving for $x_0$ in terms of $x$ and substituting this into (3.12) gives

$$u(t, x) = g\left(x - \frac{a}{2}t^2 - bt\right).$$

◊
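Solving $x = \frac{a}{2}t^2 + bt + x_0$ for $x_0$ leads to the candidate solution $u(t, x) = g(x - \frac{a}{2}t^2 - bt)$, and this can be checked against the PDE by finite differences. The sketch below is illustrative: the constants, the Gaussian initial profile, and the sample points are arbitrary choices.

```python
import math

a, b = 2.0, 1.0                       # sample constants

def g(x):
    """Sample initial profile in C^1(R)."""
    return math.exp(-x**2)

def u(t, x):
    """Candidate solution: follow the characteristic through (t, x) back to t = 0."""
    x0 = x - a / 2 * t**2 - b * t
    return g(x0)

def residual(t, x, h=1e-5):
    """Central-difference approximation of u_t + (a t + b) u_x."""
    u_t = (u(t + h, x) - u(t - h, x)) / (2 * h)
    u_x = (u(t, x + h) - u(t, x - h)) / (2 * h)
    return u_t + (a * t + b) * u_x

worst = max(abs(residual(t, x)) for t in (0.0, 0.5, 1.0) for x in (-1.0, 0.3, 2.0))
print(worst)
```

The residual should be at the level of the discretization error, and `u(0, x)` reproduces `g(x)`.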
Example 3.4 For steady flow through a pipe of changing diameter, the velocity would
vary with position rather than time. Let $v(t, x) = a + bx$ for $x \geq 0$, with $a, b > 0$.
The resulting characteristic equation (3.8) is

$$\frac{dx}{dt} = a + bx.$$

This can be solved by the standard ODE technique of separating the $t$ and $x$ variables
to different sides of the equation:

$$\frac{dx}{a + bx} = dt.$$

Integration of both sides gives the general solution

$$\frac{1}{b}\ln(a + bx) = t + C,$$

with $C$ a constant of integration. (Note that $a + bx > 0$ by our assumptions.) Solving
for $x$ gives

$$x(t) = \frac{1}{b}\left[e^{b(t + C)} - a\right].$$

Given the assumption $x \geq 0$, it is natural to index the characteristics by the start
time $t_0$ such that $x(t_0) = 0$. With this convention, the family of solutions is

$$x(t) = \frac{a}{b}\left[e^{b(t - t_0)} - 1\right]. \tag{3.13}$$

These characteristic curves are illustrated in Fig. 3.3.

Fig. 3.3 Characteristic lines for the position-dependent velocity function of Example 3.4

With $v = a + bx$ the linear conservation equation (3.6) becomes

$$\frac{\partial u}{\partial t} + (a + bx)\frac{\partial u}{\partial x} + bu = 0.$$

Let us find the solution under the boundary condition

$$u(t, 0) = f(t). \tag{3.14}$$

Since $\partial v/\partial x = b$, (3.10) gives

$$\frac{Du}{Dt} + bu = 0.$$

This is a decay equation, with the family of exponential solutions

$$u(t, x(t)) = Ae^{-bt}.$$

To fix $A$, we substitute the starting point $(t_0, 0)$ into the equation and obtain

$$u(t, x(t)) = f(t_0)e^{-b(t - t_0)}. \tag{3.15}$$

Putting together (3.13) and (3.15) and applying the boundary condition (3.14)
gives

$$u\left(t, \frac{a}{b}\left[e^{b(t - t_0)} - 1\right]\right) = f(t_0)e^{-b(t - t_0)}. \tag{3.16}$$

To express this as a function of $(t, x)$, we set

$$x = \frac{a}{b}\left[e^{b(t - t_0)} - 1\right],$$

and solve for $t_0$ to obtain

$$t_0 = t + \frac{1}{b}\ln\left(\frac{a}{a + bx}\right).$$

Substituting this expression into (3.16) gives the final form of the solution:

$$u(t, x) = \left(\frac{a}{a + bx}\right) f\left(t + \frac{1}{b}\ln\left(\frac{a}{a + bx}\right)\right).$$

A sample solution is illustrated in Fig. 3.4 for $a = 1$ and a fixed value of $b$. For this example
the boundary condition $f(t)$ was taken to have support between $t = -1$ and $t = 1$,
with a maximum at $t = 0$. The plots of $u(t, x)$ on the right show concentrations at
a succession of times. Mass conservation is reflected in the fact that the total area
under each of these curves is independent of $t$. ◊
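The solution formula $u(t, x) = \frac{a}{a + bx}\, f\!\left(t + \frac{1}{b}\ln\frac{a}{a + bx}\right)$ can be tested numerically against the PDE, the boundary condition, and mass conservation. The sketch below is illustrative only: the values $a = 1$, $b = 0.5$ and the Gaussian boundary data are assumptions (they are not the data behind Fig. 3.4), and the mass comparison is made at times late enough that essentially all of the boundary input has already entered the domain.

```python
import math

a, b = 1.0, 0.5                       # sample constants with a, b > 0

def f(t):
    """Sample boundary data u(t, 0) = f(t), concentrated near t = 0."""
    return math.exp(-t**2)

def u(t, x):
    """Solution formula obtained from the method of characteristics."""
    ratio = a / (a + b * x)
    return ratio * f(t + math.log(ratio) / b)

def residual(t, x, h=1e-5):
    """Central-difference approximation of u_t + (a + b x) u_x + b u."""
    u_t = (u(t + h, x) - u(t - h, x)) / (2 * h)
    u_x = (u(t, x + h) - u(t, x - h)) / (2 * h)
    return u_t + (a + b * x) * u_x + b * u(t, x)

def mass(t, x_max=400.0, n=40000):
    """Midpoint-rule approximation of the total mass on 0 <= x <= x_max."""
    dx = x_max / n
    return sum(u(t, (i + 0.5) * dx) for i in range(n)) * dx

worst = max(abs(residual(t, x)) for t in (0.0, 1.0, 2.0) for x in (0.5, 1.0, 3.0))
print(worst, mass(4.0), mass(5.0))
```

For this choice of data the two mass values agree closely, matching the area-under-the-curve interpretation of mass conservation.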


Fig. 3.4 Behavior of solutions for Example 3.4. In the contour plot on the left, darker regions
correspond to higher concentration. The change in colors corresponds to exponential decay along
the characteristics illustrated in Fig. 3.3.

3.3  Higher-Dimensional  Equations 

For flow problems in more than one spatial dimension, we can develop a continuity
equation analogous to (3.4) by the same reasoning as in Sect. 3.1. Suppose $u(t, x)$
represents a concentration defined for $t \in \mathbb{R}$ and $x \in \mathbb{R}^n$. Let $\mathcal{R} \subset \mathbb{R}^n$ be a bounded
region with $C^1$ boundary. The total mass within this region is given by the volume
integral

$$m(t) := \int_{\mathcal{R}} u(t, x)\,d^n x.$$

The flow of $u$ is represented by a vector-valued flux density $q(t, x)$. The
interpretation of the flux density is that the rate at which mass passes through an
$(n - 1)$-dimensional surface is given by the surface integral of $q$ over this surface.
In particular, the rate at which mass exits $\mathcal{R}$ through the boundary is the quantity

$$\int_{\partial\mathcal{R}} \nu \cdot q\,dS,$$

where $\nu$ is the outward unit normal vector defined on $\partial\mathcal{R}$.

Conservation of mass dictates that the mass within $\mathcal{R}$ can change only as mass
enters or leaves through the boundary. In other words,

$$\frac{dm}{dt} = -\int_{\partial\mathcal{R}} \nu \cdot q\,dS. \tag{3.17}$$

Assuming that $q$ is $C^1$ with respect to $x$, the Divergence Theorem (Theorem 2.6)
allows us to rewrite the flux integral as

$$\int_{\partial\mathcal{R}} \nu \cdot q\,dS = \int_{\mathcal{R}} \nabla \cdot q\,d^n x. \tag{3.18}$$

Note that since $q$ depends on both $t$ and $x$, the notation $\nabla \cdot q$ is slightly ambiguous.
We follow the standard convention that vector calculus operators such as $\nabla$ and $\Delta$
act only on spatial variables.

If $u$ is $C^1$ with respect to $t$, then we can also differentiate the integral for $m$ to
obtain

$$\frac{dm}{dt} = \int_{\mathcal{R}} \frac{\partial u}{\partial t}\,d^n x.$$

Combining this with (3.17) and (3.18) gives

$$\int_{\mathcal{R}} \left(\frac{\partial u}{\partial t} + \nabla \cdot q\right) d^n x = 0. \tag{3.19}$$

As in the one-dimensional case, we now observe that since (3.19) holds for an arbitrary
region $\mathcal{R}$, the integrand must vanish. This is the higher-dimensional continuity
equation:

$$\frac{\partial u}{\partial t} + \nabla \cdot q = 0. \tag{3.20}$$


Suppose we make the linear assumption that $q = vu$ for a velocity field $v$ which
is independent of $u$. The product rule for the divergence of a vector field is

$$\nabla \cdot (vu) = (\nabla \cdot v)u + v \cdot \nabla u.$$

Substituting this into (3.20) gives the higher-dimensional form of the linear conservation
equation

$$\frac{\partial u}{\partial t} + v \cdot \nabla u + (\nabla \cdot v)u = 0. \tag{3.21}$$

In the special case where $\nabla \cdot v = 0$ the velocity field is called solenoidal (or
divergence-free). This situation arises frequently in applications, because incompressible
fluids like blood or water have solenoidal velocity fields.

The method of characteristics from Sect. 3.2 can be adapted directly to (3.21).
Consider a somewhat more general first-order PDE in the form

$$\frac{\partial u}{\partial t} + v \cdot \nabla u + w = 0, \tag{3.22}$$

with $v = v(t, x)$ and $w = w(t, x, u)$. The characteristics associated to this equation
are by definition the solutions of

$$\frac{dx}{dt}(t) = v(t, x(t)). \tag{3.23}$$

Theorem 2.4 guarantees that characteristics exist in the neighborhood of each starting
point $(t_0, x_0)$ provided $v(t, x)$ and its partial derivatives with respect to $x$ are
continuous.

The Lagrangian derivative of $u$ along $x(t)$ is defined as before by

$$\frac{Du}{Dt}(t) := \frac{d}{dt}u(t, x(t)).$$

The higher-dimensional version of Theorem 3.2 is the following:

Theorem 3.5 On each characteristic curve, the PDE (3.22) reduces to the ODE

$$\frac{Du}{Dt} + w = 0, \tag{3.24}$$

where $w$ denotes the restriction of $w$ to the characteristic. In particular, if $w = 0$
then $u$ is constant on each characteristic.

Proof By the chain rule,

$$\frac{Du}{Dt} = \frac{\partial u}{\partial t}(t, x(t)) + \nabla u(t, x(t)) \cdot \frac{dx}{dt}.$$

Since $x(t)$ satisfies (3.23), this gives

$$\frac{Du}{Dt} = \frac{\partial u}{\partial t} + v \cdot \nabla u.$$

Substituting this into (3.22) reduces the equation to (3.24).
If $w = 0$ the equation becomes

$$\frac{Du}{Dt} = 0,$$

which means precisely that $u$ is constant along the characteristic curves. □

Example  3. 6  Consider  a  two-dimensional  channel  modeled  as  Q  —  R  x  [-1,1] 
with  coordinates  x  =  (x\,  xf).  The  velocity  field 


v(t,  x)  :=  (1  —  x\,  0). 


(3.25) 


is  solenoidal  and  vanishes  on  the  boundary  {x2  =  ±1}.  The  characteristic  line  orig¬ 
inating  from  (a ,  b)  e  £2  at  t  =  0  is 

x(t)  =  (a  +  (1  —  b2)t ,  h ). 

Let us consider the conservation equation (3.21) for $(t, x) \in \mathbb{R} \times \Omega$, with $v$ given
by (3.25), subject to the initial condition

\[
u(0, x) = g(x),
\]

for $g \in C^1(\Omega)$. Since $v$ is solenoidal, Theorem 3.5 implies that $u$ is constant on
characteristics. This gives the relation

\[
u(t, a + (1 - b^2)t, b) = g(a, b).
\]

Rewriting this as a function of $(t, x, y)$ gives

\[
u(t, x, y) = g(x - (1 - y^2)t, \, y).
\]

Figure 3.5 illustrates the evolution of a circular "ink spot" distribution under this
flow. Conservation of mass is reflected in the fact that the area of the spot is
independent of $t$. ◊

Fig. 3.5 Evolution of a circle according to the two-dimensional flow in Example 3.6
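The relation between the characteristics and the solution formula in Example 3.6 can be checked numerically. The sketch below is only an illustration: the Gaussian profile `ink_spot` and the sample starting point are hypothetical choices, not data from the text. It evaluates $u(t, x, y) = g(x - (1 - y^2)t, y)$ at several points of one characteristic and confirms that the value does not change.

```python
import math

def characteristic(a, b, t):
    """Point reached at time t by the characteristic that starts at (a, b) when t = 0."""
    return (a + (1.0 - b * b) * t, b)

def solution(g, t, x, y):
    """u(t, x, y) = g(x - (1 - y^2) t, y): follow the characteristic back to t = 0."""
    return g(x - (1.0 - y * y) * t, y)

def ink_spot(x, y):
    """Hypothetical smooth initial profile, chosen only for illustration."""
    return math.exp(-4.0 * (x * x + y * y))

# Sample u at three times along the characteristic through (a, b) = (0.3, 0.5).
a, b = 0.3, 0.5
values = [solution(ink_spot, t, *characteristic(a, b, t)) for t in (0.0, 1.0, 2.5)]
print(values)  # the three entries agree up to rounding
```

Any other differentiable profile could be substituted for `ink_spot`; only the argument shift $x - (1 - y^2)t$ reflects the example.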

For applications of Theorem 3.5 on a bounded domain $\Omega \subset \mathbb{R}^n$, the specification
of boundary conditions can be quite a complicated problem, especially if the velocity
is time-dependent. (We avoided this problem in Example 3.6 by taking $v$ tangent to
$\partial\Omega$.) We will illustrate this issue in the exercises.


3.4  Quasilinear  Equations 

The  method  of  characteristics  remains  an  important  tool  for  analysis  of  first-order 
PDE  even  in  the  nonlinear  case.  In  this  section  we  will  illustrate  the  application  of 
this  method  to  the  continuity  equation  (3.20)  in  the  case  of  a  flux  term  q  that  depends 
on  the  concentration  u. 

To simplify the analysis, we assume that $q = q(u)$, with no explicit dependence
on $t$ and $x$. By the chain rule, (3.20) then reduces to the form

\[
\frac{\partial u}{\partial t} + a(u) \cdot \nabla u = 0, \tag{3.26}
\]

where $a(u) := dq/du$. This type of PDE is called quasilinear, which means that the
equation is linear in the highest-order derivatives (which are merely first order in this
case).

A  comparison  of  (3.26)  to  the  linear  conservation  equation  (3.21)  shows  that  a (u) 
is  now  playing  the  role  of  velocity.  This  suggests  a  definition  for  the  characteristics, 
but  we  must  keep  in  mind  that  a (u)  depends  on  t  and  x  implicitly  through  u. 

Theorem 3.7 Suppose that $u \in C^1([0, T] \times \Omega)$ is a solution of (3.26) for some
region $\Omega \subset \mathbb{R}^n$, with $a \in C^1(\mathbb{R}; \mathbb{R}^n)$. Then for each $x_0 \in \Omega$, $u$ is constant along the
characteristic line defined by

\[
x(t) = x_0 + a(u(0, x_0))t.
\]

Proof Suppose that a solution $u$ exists. Let $x(t)$ be the solution to the ODE

\[
\frac{dx}{dt}(t) = a(u(t, x(t))), \qquad x(0) = x_0,
\]

for $t \in [0, T]$. Existence of such a characteristic is guaranteed by Theorem 2.4, at
least for $t$ near 0, because the composition $a \circ u$ is $C^1$ as a function of $(t, x)$ by the
assumptions on $a$ and $u$.

To  establish  the  claim  that  u(t,x(t))  is  independent  of  t,  we  use  the  chain  rule  to 
differentiate 


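\[
\begin{aligned}
\frac{d}{dt} u(t, x(t)) &= \frac{\partial u}{\partial t}(t, x(t)) + \nabla u(t, x(t)) \cdot \frac{dx}{dt}(t) \\
&= \frac{\partial u}{\partial t}(t, x(t)) + a(u(t, x(t))) \cdot \nabla u(t, x(t)).
\end{aligned}
\]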


The  right-hand  side  vanishes  by  (3.26),  so  that 


\[
\frac{d}{dt} u(t, x(t)) = 0.
\]


This  implies  that 


\[
u(t, x(t)) = u(0, x_0),
\]

which means that $a(u(t, x(t)))$ is also constant. The characteristic equation reduces
to

\[
\frac{dx}{dt}(t) = a(u(0, x_0)),
\]

and  we  can  integrate  over  t  to  compute  x(t).  □ 

In contrast to the characteristic equation (3.8) in the linear case, the equation for
$x(t)$ here depends on the initial condition $u(0, x_0)$. Furthermore, it is important to
keep in mind that Theorem 3.7 does not imply that a solution to (3.26) exists; this
is assumed as a hypothesis. As we will see below, it is possible that the conclusion
of the theorem will lead to a contradiction, in the form of multiple values for the
solution at the same point. The implication in such a case is that a classical solution
does not exist.

To illustrate the application of Theorem 3.7, let us consider a simple model for
traffic on a single-lane road of infinite length, parametrized by $x \in \mathbb{R}$. Let $u(t, x)$
denote the linear density of cars at a given point and time. Cars are discrete objects,
of course, but for modeling purposes we can assume that $u$ is a $C^1$ function that
describes the density in an aggregate sense.

In traffic flow, the density of cars affects the flow velocity, with traffic slowing
down and possibly stopping as the density increases. A standard way to model this
effect is to set a maximum value $v_m$ for the velocity (presumably the speed limit).
The velocity is assumed to take its maximum value at $u = 0$ and decrease linearly
as $u$ increases, up to some maximum density $u_m$ for which $v = 0$. In other words, for
this model $u \in [0, u_m]$ and

\[
v(u) := v_m \left( 1 - \frac{u}{u_m} \right).
\]

Since $v \ge 0$, the model always assumes that traffic moves to the right.

To eliminate the constants and focus on the equation itself, let us set $v_m = 1$ and
$u_m = 1$, reducing the velocity equation to

\[
v(u) = 1 - u
\]

for $u \in [0, 1]$. The corresponding flux is

\[
q(u) = u - u^2.
\]

Substituting these assumptions into (3.26), with $a(u) = dq/du = 1 - 2u$, we obtain a
quasilinear equation called the traffic equation:

\[
\frac{\partial u}{\partial t} + (1 - 2u) \frac{\partial u}{\partial x} = 0. \tag{3.27}
\]


Suppose  we  impose  a  general  initial  condition  of  the  form 


\[
u(0, x) = h(x),
\]

for some $h : \mathbb{R} \to [0, 1]$. Assuming a solution exists, Theorem 3.7 gives the family
of characteristics

\[
x(t) = x_0 + (1 - 2h(x_0))t. \tag{3.28}
\]

Therefore, the solution $u$ must satisfy

\[
u(t, x_0 + (1 - 2h(x_0))t) = h(x_0). \tag{3.29}
\]

Fig. 3.6 Initial traffic density modeling a line of cars stopped at a traffic light


As  we  will  demonstrate  in  the  examples  below,  (3.29)  leads  to  a  solution  formula  for 
some  choices  of  h,  while  for  others  it  leads  to  a  contradiction. 

Example 3.8 Figure 3.6 shows a plot of the initial condition

\[
h(x) = \frac{1}{2} - \frac{1}{\pi} \arctan(20x),
\]

which could represent a line of cars stopped at a traffic light at the point $x = 0$. The
corresponding characteristic lines as given by (3.28) are plotted in Fig. 3.7.

To derive a formula for $u(t, x)$ from (3.29), we need to invert the equation

\[
x = x_0 + (1 - 2h(x_0))t,
\]

to express $x_0$ as a function of $t$ and $x$. For the function $h$ given above it is not possible
to do this explicitly. However, there is a unique solution for each $(t, x)$ with $t \ge 0$,
which can easily be calculated numerically. The resulting solutions are shown in Fig. 3.8. ◊
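The numerical inversion mentioned in Example 3.8 can be sketched with a bisection search. This is one possible approach under stated assumptions, not necessarily how the figures in the text were produced: the function names are hypothetical, the initial density is taken to be $h(x) = \frac{1}{2} - \frac{1}{\pi}\arctan(20x)$, and $t \ge 0$. Because the characteristic speed $1 - 2h$ lies in $[-1, 1]$, the starting point $x_0$ must lie in $[x - t, x + t]$, and because $h$ is decreasing the map $x_0 \mapsto x_0 + (1 - 2h(x_0))t$ is increasing, so bisection on that interval finds the unique root.

```python
import math

def h(x):
    """Initial density assumed for Example 3.8."""
    return 0.5 - math.atan(20.0 * x) / math.pi

def foot_of_characteristic(t, x, tol=1e-12):
    """Solve x = x0 + (1 - 2 h(x0)) t for x0 by bisection, assuming t >= 0."""
    lo, hi = x - t, x + t  # the speed 1 - 2h lies in [-1, 1]
    residual = lambda x0: x0 + (1.0 - 2.0 * h(x0)) * t - x
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid   # root lies to the right of mid
        else:
            hi = mid   # root lies at or to the left of mid
    return 0.5 * (lo + hi)

def u(t, x):
    """Density from (3.29): the value of h at the foot of the characteristic."""
    return h(foot_of_characteristic(t, x))

print(u(1.0, 0.2))
```

Evaluating `u` on a grid of $x$ values for several fixed times would give density profiles of the kind described for Fig. 3.8.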

Example 3.9 In order to solve the traffic equation explicitly, let us simplify the initial
condition to the piecewise linear function

\[
h(x) = \begin{cases} 1, & x \le 0, \\ 1 - x, & 0 < x < 1, \\ 0, & x \ge 1. \end{cases}
\]

This is not $C^1$, but the resulting solution could be interpreted as a weak solution in
the sense described in Sect. 1.2. We will discuss the precise definition in Chap. 10.
By the formula from Theorem 3.7, the characteristic lines are

\[
x(t) = \begin{cases} x_0 - t, & x_0 \le 0, \\ x_0 + (2x_0 - 1)t, & 0 < x_0 < 1, \\ x_0 + t, & x_0 \ge 1. \end{cases} \tag{3.30}
\]
Solving these equations for $x_0$ gives

\[
x_0 = \begin{cases} x + t, & x \le -t, \\ \dfrac{x + t}{1 + 2t}, & -t < x < 1 + t, \\ x - t, & x \ge 1 + t. \end{cases}
\]

Fig. 3.7 Characteristic lines for the initial density shown in Fig. 3.6

Therefore, by the solution formula (3.29), the solution is

\[
u(t, x) = \begin{cases} 1, & x \le -t, \\ 1 - \dfrac{x + t}{1 + 2t}, & -t < x < 1 + t, \\ 0, & x \ge 1 + t. \end{cases} \tag{3.31}
\]

This is a continuous function, but differentiability fails on the lines $x = -t$ and
$x = 1 + t$. Away from these lines it is easy to check that $u$ solves (3.27).

Despite the lack of smoothness, this solution is quite reasonable. To illustrate
this, let us trace the motion of a particular car starting from the position $x_0 < 0$. The
velocity of the car is given by the flow rate $v(u) = 1 - u$. The initial density at $x_0$ is
$u = 1$, so the car is stationary for a time. According to (3.31), at $t = -x_0$ the point
$(t, x_0)$ enters the region where $-t < x < 1 + t$, and so at this time the density starts
to decrease and the car starts to move. For $(t, x)$ in this range, (3.31) gives

\[
v(t, x) = 1 - u(t, x) = \frac{x + t}{1 + 2t}. \tag{3.32}
\]

Fig. 3.9 Trajectories of individual cars according to the model of Example 3.9


Let $s(t)$ denote the position of the car at time $t$. For $t > -x_0$ the velocity formula
(3.32) gives the equation

\[
\frac{ds}{dt} = \frac{s + t}{1 + 2t}. \tag{3.33}
\]

The initial condition at $t = -x_0$ is the original starting point $s(-x_0) = x_0$. The
standard ODE method of integrating factors can be used to solve (3.33), yielding

\[
s(t) = \begin{cases} x_0, & 0 \le t \le -x_0, \\ 1 + t - \sqrt{(1 - 2x_0)(1 + 2t)}, & t \ge -x_0. \end{cases}
\]

These trajectories are illustrated in Fig. 3.9. As we might expect, the cars further back
in the line wait longer before moving, but each car eventually moves forward and
gradually accelerates. ◊
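As a sanity check on Example 3.9, the car trajectory can be compared against the velocity prescribed by the model. The sketch below assumes the piecewise density $u(t, x)$ equal to $1$ for $x \le -t$, to $1 - (x+t)/(1+2t)$ for $-t < x < 1+t$, and to $0$ otherwise, together with the trajectory $s(t) = 1 + t - \sqrt{(1 - 2x_0)(1 + 2t)}$ for $t \ge -x_0$. The function names and the finite-difference step are illustrative choices.

```python
import math

def u(t, x):
    """Piecewise density for the traffic equation with the piecewise linear initial data."""
    if x <= -t:
        return 1.0
    if x >= 1.0 + t:
        return 0.0
    return 1.0 - (x + t) / (1.0 + 2.0 * t)

def car_position(t, x0):
    """Trajectory s(t) of the car that starts at x0 < 0."""
    if t <= -x0:
        return x0  # the car waits until the density ahead starts to drop
    return 1.0 + t - math.sqrt((1.0 - 2.0 * x0) * (1.0 + 2.0 * t))

def velocity_residual(t, x0, dt=1e-6):
    """Centered difference of s(t) minus the model velocity 1 - u(t, s(t))."""
    ds = (car_position(t + dt, x0) - car_position(t - dt, x0)) / (2.0 * dt)
    return ds - (1.0 - u(t, car_position(t, x0)))

print(velocity_residual(2.0, -0.5))  # close to zero: s(t) satisfies ds/dt = 1 - u
```

The residual is also zero during the waiting phase, since there the car is at rest and the density at its position equals 1.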

Example 3.10 Consider the initial condition

\[
h(x) = \frac{1}{2} + \frac{1}{\pi} \arctan(20x),
\]

as shown in Fig. 3.10. This is the reverse of the initial condition of Example 3.8.
The  characteristics  specified  in  Theorem  3.7  now  cross  each  other,  as  illustrated  in 
Fig.  3.11.  The  existence  of  crossings  implies  that  a  classical  solution  with  this  initial 
condition  cannot  exist  beyond  the  time  of  the  first  crossing. 

If we were to trace the trajectories of individual cars, as we did in Example 3.9,
we would see that these also intersect each other at the points where characteristics
cross. In effect, the model predicts the formation of a traffic jam. ◊

Fig. 3.10 Initial traffic density with a near-maximum density of cars to the right

Fig. 3.11 Conflicting characteristic lines for the initial density shown in Fig. 3.10


A  crossing  of  characteristics  as  observed  in  Example  3.10  is  called  a  shock.  After 
the  shock,  the  solution  is  forced  to  have  discontinuities.  The  proper  interpretation  of 
this  situation  requires  weak  solutions,  for  which  discontinuities  are  allowed.  We  will 
return  to  this  issue  in  Chap.  10. 
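The crossing of characteristics in Example 3.10 can be made concrete for a single pair of lines. The following sketch assumes the initial density $h(x) = \frac{1}{2} + \frac{1}{\pi}\arctan(20x)$ and uses hypothetical helper names. Two characteristics $x_a + (1 - 2h(x_a))t$ and $x_b + (1 - 2h(x_b))t$ with $x_a < x_b$ meet when the line on the left moves faster than the line on the right, at the time obtained by equating the two expressions.

```python
import math

def h(x):
    """Initial density assumed for Example 3.10."""
    return 0.5 + math.atan(20.0 * x) / math.pi

def speed(x0):
    """Slope 1 - 2 h(x0) of the characteristic that starts at x0."""
    return 1.0 - 2.0 * h(x0)

def crossing_time(xa, xb):
    """Time t > 0 at which the characteristics from xa < xb meet, or None if they do not."""
    closing = speed(xa) - speed(xb)  # rate at which the gap xb - xa shrinks
    if closing <= 0.0:
        return None
    return (xb - xa) / closing

t = crossing_time(-0.1, 0.1)
print(t, h(-0.1), h(0.1))
```

At the meeting point the solution would have to equal both $h(-0.1)$ and $h(0.1)$, which differ, so no classical solution can persist up to that time.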

3.5  Exercises 


3.1 Consider the conservation equation with a constant velocity $c > 0$,

\[
\frac{\partial u}{\partial t} + c \frac{\partial u}{\partial x} = 0,
\]

on the quadrant $t \ge 0$, $x \ge 0$. Suppose the boundary and initial conditions are

\[
u(0, x) = g(x), \quad x \ge 0, \qquad u(t, 0) = h(t), \quad t \ge 0,
\]

for $g, h \in C^1[0, \infty)$.

(a) Find a formula for the solution $u(t, x)$ in terms of $g$ and $h$.

(b) Find a matching condition for $g$ and $h$ that will ensure that $u(t, x)$ is a $C^1$
function.


3.2 In the continuity equation (3.4), external factors that break the conservation of
mass are accounted for by adding terms to the right-hand side.

(a) A forcing term $f(t, x)$ is independent of the existing concentration. (In the
bloodstream model of Sect. 3.1, this could represent intravenous injection, for
example.) Assume that $c$ is constant, $f \in C^1(\mathbb{R}^2)$, and $g \in C^1(\mathbb{R})$. Solve the
equation

\[
\frac{\partial u}{\partial t} + c \frac{\partial u}{\partial x} = f, \qquad u(0, x) = g(x),
\]

to find an explicit formula for $u(t, x)$ in terms of $f$ and $g$.

(b) A reaction term depends on the concentration $u$. The simplest case is a linear
term $\gamma u$ where the coefficient is some function $\gamma(t, x)$. (This could represent
absorption of oxygen into the walls of the artery, for example.) Assume that $c$ is
constant, $\gamma \in C^1(\mathbb{R}^2)$, and $g \in C^1(\mathbb{R})$. Solve the equation

\[
\frac{\partial u}{\partial t} + c \frac{\partial u}{\partial x} = \gamma u, \qquad u(0, x) = g(x),
\]

to find an explicit formula for $u(t, x)$ in terms of $\gamma$ and $g$.


3.3 Assume that $u$ satisfies the linear conservation equation

\[
\frac{\partial u}{\partial t} + 2t \frac{\partial u}{\partial x} = 0,
\]

for $t \in \mathbb{R}$ and $x \in [0, 1]$. Suppose the boundary conditions are given by

\[
u(t, 0) = h_0(t), \qquad u(t, 1) = h_1(t).
\]

Find a relation between $h_0$ and $h_1$. (This shows that we can only impose a boundary
condition at one side of the interval $[0, 1]$.)

3.4 If the spatial domain in the linear conservation equation (3.21) is a bounded
region $\Omega \subset \mathbb{R}^n$, then for a given velocity field $v$, the inflow boundary $\partial\Omega_{\mathrm{in}} \subset \partial\Omega$
is defined as the set of boundary points where $v$ points into $\Omega$. Fixing boundary
conditions on the inflow boundary will generally determine the solution in the interior.
Suppose $\Omega = (-1, 1) \times (-1, 1) \subset \mathbb{R}^2$ with coordinates $(x_1, x_2)$. For the velocity
fields below, determine the characteristics and specify the inflow boundary. Draw a
sketch of $\Omega$ for each case, indicating these features.

(a) $v(x_1, x_2) = (x_2, 1)$.

(b) $v(x_1, x_2) = (1, -x_2)$.

3.5 Suppose that a section of a river is modeled as a rectangle $\Omega = (0, \ell) \times (0, 1) \subset \mathbb{R}^2$,
parametrized by $(x_1, x_2)$. Assume the flow is parallel to the $x_1$-axis,
with velocity

\[
v(x_1, x_2) = (f(x_2), 0),
\]

for some positive function $f$ on $(0, 1)$. Assume also that the concentration on the
left boundary $\{x_1 = 0\}$ is given by

\[
u(t, 0, x_2) = h(t, x_2).
\]

Find a formula for $u(t, x_1, x_2)$ in terms of the functions $h$ and $f$.


3.6 Burgers' equation is a simple quasilinear equation that appears in models of gas
dynamics,

\[
\frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} = 0.
\]

(a) Use the method of characteristics as described in Sect. 3.4 to find a formula for
the solution $u(t, x)$ given the initial condition

\[
u(0, x) = \begin{cases} 0, & x \le 0, \\ x/a, & 0 < x < a, \\ 1, & x \ge a. \end{cases}
\]

(b) Suppose $a > b$ and

\[
u(0, x) = \begin{cases} a, & x \le 0, \\ a(1 - x) + bx, & 0 < x < 1, \\ b, & x \ge 1. \end{cases}
\]

Show that all of the characteristics originating from $x_0 \in [0, 1]$ meet at the same
point (thus creating a shock).


3.7 In the mid-19th century, William Hamilton and Carl Jacobi developed a formulation
of classical mechanics based on ideas from geometric optics. In this approach
the dynamics of a free particle in $\mathbb{R}$ are described by a generating function $u(t, x)$
satisfying the Hamilton-Jacobi equation:

\[
\frac{\partial u}{\partial t} + \frac{1}{2} \left( \frac{\partial u}{\partial x} \right)^2 = 0. \tag{3.34}
\]

Assume that $u \in C^2([0, \infty) \times \mathbb{R})$ is a solution of (3.34). By analogy with Theorem 3.7,
a characteristic of (3.34) is defined as a solution of

\[
\frac{dx}{dt}(t) = \frac{\partial u}{\partial x}(t, x(t)). \tag{3.35}
\]

(a) Assuming that $x(t)$ solves (3.35), use the chain rule to compute $d^2x/dt^2$.

(b) Differentiate (3.34) with respect to $x$ and then restrict the result to $(t, x(t))$,
where $x(t)$ solves (3.35). Conclude from (a) that

\[
\frac{d^2 x}{dt^2} = 0.
\]

Hence, for some constant $v_0$ (which depends on the characteristic),

\[
x(t) = x_0 + v_0 t.
\]

(c) Show that the Lagrangian derivative of $u$ along $x(t)$ satisfies

\[
\frac{Du}{Dt} = \frac{1}{2} v_0^2,
\]

implying that

\[
u(t, x_0 + v_0 t) = u(0, x_0) + \frac{1}{2} v_0^2 t.
\]

(d) Use this approach to find the solution $u(t, x)$ under the initial condition

\[
u(0, x) = x^2.
\]

(For the characteristic starting at $(0, x_0)$, note that you can compute $v_0$ by evaluating
(3.35) at $t = 0$.)


Chapter  4 

The  Wave  Equation 


As  we  noted  in  Sect.  1.2,  d’Alembert’s  derivation  of  the  wave  equation  in  the  18th 
century  was  an  early  milestone  in  the  development  of  PDE  theory.  In  this  chapter 
we  will  develop  this  equation  as  a  model  for  the  vibrating  string  problem,  and  derive 
d’Alembert’s  explicit  solution  in  one  dimension  using  the  method  of  characteristics 
introduced  in  Chap.  3. 

In higher dimensions the wave equation is used to model electromagnetic or
acoustic waves. We will discuss the derivation of the acoustic model later in
Sect. 4.5. A clever reduction trick allows the solution formula for $\mathbb{R}^n$ to be deduced
from the one-dimensional case. The resulting integral formula yields insight into the
propagation of waves in different dimensions.

The  chapter  concludes  with  a  discussion  of  the  energy  of  a  solution,  based  on  the 
physical  principles  of  kinetic  and  potential  energy. 


4.1  Model  Problem:  Vibrating  String 

Consider  a  flexible  string  that  is  stretched  tight  between  two  points,  like  the  strings 
on  a  violin  or  guitar.  The  stretching  of  the  string  creates  a  tension  force  T  that  pulls 
in  both  directions  at  each  point  along  its  length.  For  simplicity,  let  us  assume  that 
any  other  forces  acting  on  the  string,  including  gravity,  are  negligible  compared  to 
the tension. The linear density of mass $\rho$ is taken to be constant along the string.

For  a  violin  string  it  is  also  reasonable  to  assume  that  the  displacement  of  the 
string  is  extremely  small  relative  to  its  length.  This  assumption  justifies  taking  T  to 
be  a  fixed  constant,  ignoring  the  additional  stretching  that  occurs  when  the  string 
is  displaced.  It  also  allows  us  to  treat  horizontal  and  vertical  components  of  the 
displacement  independently,  so  we  can  restrict  our  attention  to  the  vertical. 


The original version of the book was revised: Belated corrections from author have been
incorporated. The erratum to the book is available at https://doi.org/10.1007/978-3-319-48936-0_14


Let the string be parametrized by $x \in [0, \ell]$. The vertical displacement as a
function of time is denoted by $u(t, x)$. To develop an equation for $u$, we first discretize
the model by subdividing the total length $\ell$ into segments of length $\Delta x = \ell/n$ for
some large $n$. Each segment has a mass $\rho\Delta x$ and is subject to the tension forces
pulling in the direction of its neighbors on either side.

For $j = 0, \dots, n$, let $x_j := j\Delta x$ be the position of the $j$th segment along the string.
The segments $j = 0$ and $j = n$ represent the fixed endpoints, with $j = 1, \dots, n - 1$
in the interior. Let $u(t, x_j)$ denote the vertical displacement of the $j$th segment as a
function of time. Figure 4.1 illustrates this discretization (with displacements greatly
exaggerated).

To  develop  an  equation  for  the  string,  we  apply  Newton’s  laws  of  motion  to  the 
segments  of  the  discretization,  as  if  they  were  single  particles.  The  jth  particle  is 
being  pulled  by  its  neighbors  with  a  force  T  on  each  side.  Unless  the  string  is  straight, 
these  forces  are  not  quite  aligned. 

In terms of the angles labeled in Fig. 4.2, the net vertical force on a single segment is

\[
\Delta F(t, x_j) = T \sin\alpha_j + T \sin\beta_j.
\]

We have assumed that the relative displacements are extremely small, so the angles
$\alpha_j$, $\beta_j$ will be very small also. To leading order, we can replace the sines by tangents,
which are linear in $u$:

\[
\sin\alpha_j \approx \frac{u(t, x_{j-1}) - u(t, x_j)}{\Delta x}, \qquad
\sin\beta_j \approx \frac{u(t, x_{j+1}) - u(t, x_j)}{\Delta x}.
\]

With this linear approximation, the net vertical force at the point $x_j$ becomes

\[
\Delta F(t, x_j) = \frac{T}{\Delta x} \left[ u(t, x_{j+1}) + u(t, x_{j-1}) - 2u(t, x_j) \right]. \tag{4.1}
\]


Fig. 4.1 Discrete model for the displacement of the string

Fig. 4.2 Discrete model for the displacement of the string

The equation of motion for the $j$th segment now comes from Newton's law: mass
times acceleration equals force. At the point $x_j$ this translates to

\[
\rho \Delta x \, \frac{\partial^2 u}{\partial t^2}(t, x_j) = \Delta F(t, x_j). \tag{4.2}
\]

Using (4.1) on the right then gives

\[
\frac{\partial^2 u}{\partial t^2}(t, x_j) = \frac{T}{\rho} \, \frac{u(t, x_{j+1}) + u(t, x_{j-1}) - 2u(t, x_j)}{(\Delta x)^2}. \tag{4.3}
\]


The final step is to take the continuum limit $n \to \infty$ and $\Delta x \to 0$. Assuming
that $u$ is twice continuously differentiable as a function of $x$, we can deduce from
the quadratic Taylor approximation of $u(t, x)$ that

\[
\lim_{\Delta x \to 0} \frac{u(t, x + \Delta x) + u(t, x - \Delta x) - 2u(t, x)}{(\Delta x)^2} = \frac{\partial^2 u}{\partial x^2}(t, x).
\]

Hence, taking $\Delta x \to 0$ in (4.3) gives

\[
\frac{\partial^2 u}{\partial t^2} = \frac{T}{\rho} \frac{\partial^2 u}{\partial x^2}. \tag{4.4}
\]

This is the one-dimensional wave equation. The fixed ends of the string correspond
to Dirichlet boundary conditions,

\[
u(t, 0) = u(t, \ell) = 0.
\]
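The continuum limit used in the derivation rests on the fact that the symmetric second difference quotient converges to the second derivative. A small numerical experiment illustrates this. The test profile $\sin x$ and the sample point are arbitrary choices made for this sketch, and `second_difference` is a hypothetical helper name.

```python
import math

def second_difference(profile, x, dx):
    """Symmetric second difference quotient [u(x+dx) + u(x-dx) - 2u(x)] / dx^2."""
    return (profile(x + dx) + profile(x - dx) - 2.0 * profile(x)) / (dx * dx)

# For u = sin, the exact second derivative at x is -sin(x).
x = 1.0
for dx in (1e-1, 1e-2, 1e-3):
    error = abs(second_difference(math.sin, x, dx) + math.sin(x))
    print(dx, error)  # the error shrinks roughly like dx^2
```

Each tenfold reduction of the spacing reduces the error by roughly a factor of one hundred, consistent with the quadratic Taylor approximation.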


4.2  Characteristics 


For convenience, set $c^2 := T/\rho$ in (4.4), assuming $c > 0$, and rewrite the equation
as

\[
\frac{\partial^2 u}{\partial t^2} - c^2 \frac{\partial^2 u}{\partial x^2} = 0. \tag{4.5}
\]


The  constant  c  is  called  the  propagation  speed ,  for  reasons  that  will  become  apparent 
as  we  analyze  the  equation. 

Let the physical domain be $x \in \mathbb{R}$ for the moment; we will discuss boundary
conditions later. The key to applying the method of characteristics to (4.5) is that the
differential operator appearing in the equation factors as a product of two first-order
operators, i.e.,

\[
\frac{\partial^2}{\partial t^2} - c^2 \frac{\partial^2}{\partial x^2}
= \left( \frac{\partial}{\partial t} + c \frac{\partial}{\partial x} \right) \left( \frac{\partial}{\partial t} - c \frac{\partial}{\partial x} \right). \tag{4.6}
\]

Individually, these operators have characteristic lines $t \mapsto x_0 \pm ct$. Both sets of
characteristics will play an important role here.

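Theorem 4.1 Under the initial conditions

\[
u(0, x) = g(x), \qquad \frac{\partial u}{\partial t}(0, x) = h(x), \tag{4.7}
\]

for $g \in C^2(\mathbb{R})$ and $h \in C^1(\mathbb{R})$, the wave equation (4.5) admits a unique solution

\[
u(t, x) = \frac{1}{2} \left[ g(x + ct) + g(x - ct) \right] + \frac{1}{2c} \int_{x - ct}^{x + ct} h(\tau) \, d\tau. \tag{4.8}
\]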


Proof Consider the auxiliary function $w(t, x)$ defined by

\[
w := \frac{\partial u}{\partial t} - c \frac{\partial u}{\partial x}. \tag{4.9}
\]

By (4.6), $w$ satisfies the linear conservation equation

\[
\frac{\partial w}{\partial t} + c \frac{\partial w}{\partial x} = 0.
\]

The characteristics for this equation are given by $x_+(t) = x_0 + ct$. By Theorem 3.2
the unique solution with an initial condition $w(0, x) = w_0(x)$ is

\[
w(t, x) = w_0(x - ct). \tag{4.10}
\]


We will relate $w_0$ back to the initial conditions $g$ and $h$ in a moment.

With $w$ given by (4.10), the definition (4.9) can be regarded as a linear conservation
equation for $u$,

\[
\frac{\partial u}{\partial t} - c \frac{\partial u}{\partial x} = w, \tag{4.11}
\]

where $w$ acts as a forcing term as described in Exercise 3.2. The characteristics of
(4.11) are $x_-(t) = x_0 - ct$. By Theorem 3.2, we can thus reduce the equation to the
form

\[
\frac{d}{dt} u(t, x_0 - ct) = w(t, x_0 - ct). \tag{4.12}
\]


The unique solution to (4.12) under the initial condition $u(0, x) = g(x)$ is given
by direct integration with respect to time:

\[
u(t, x_0 - ct) = g(x_0) + \int_0^t w(s, x_0 - cs) \, ds.
\]

Setting $x = x_0 - ct$ then gives

\[
u(t, x) = g(x + ct) + \int_0^t w(s, x - c(s - t)) \, ds.
\]

Using the formula (4.10) for the solution $w$, we obtain

\[
u(t, x) = g(x + ct) + \int_0^t w_0(x - 2cs + ct) \, ds.
\]

As a final step, the substitution $\tau := x + ct - 2cs$ gives

\[
u(t, x) = g(x + ct) + \frac{1}{2c} \int_{x - ct}^{x + ct} w_0(\tau) \, d\tau. \tag{4.13}
\]


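The function $w_0$ can be computed from the initial conditions (4.7),

\[
w_0(x) := \frac{\partial u}{\partial t}(0, x) - c \frac{\partial u}{\partial x}(0, x)
= h(x) - c \frac{\partial g}{\partial x}(x).
\]

The $w_0$ contribution to (4.13) is then given by

\[
\begin{aligned}
\frac{1}{2c} \int_{x - ct}^{x + ct} w_0(\tau) \, d\tau
&= \frac{1}{2c} \int_{x - ct}^{x + ct} h(\tau) \, d\tau - \frac{1}{2} \int_{x - ct}^{x + ct} \frac{\partial g}{\partial x}(\tau) \, d\tau \\
&= \frac{1}{2c} \int_{x - ct}^{x + ct} h(\tau) \, d\tau - \frac{1}{2} \left[ g(x + ct) - g(x - ct) \right].
\end{aligned}
\]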


Substituting  back  into  (4.13)  now  gives  the  formula  (4.8). 


□ 
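The solution formula of Theorem 4.1 is straightforward to evaluate numerically. The sketch below is an illustration under stated assumptions rather than a definitive implementation: it approximates the integral of $h$ with a composite Simpson rule (a choice made here, not taken from the text), and the names `simpson` and `dalembert` are hypothetical. As a check, the data $g(x) = \sin x$, $h(x) = c\cos x$ should reproduce the right-moving-free solution $u(t, x) = \sin(x + ct)$, since then $\frac{1}{2}[g(x+ct) + g(x-ct)] + \frac{1}{2}[\sin(x+ct) - \sin(x-ct)] = \sin(x + ct)$.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with n subintervals (n must be even)."""
    step = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * step)
    return total * step / 3.0

def dalembert(g, h, c, t, x):
    """Evaluate u(t, x) = [g(x+ct) + g(x-ct)]/2 + (1/2c) * integral of h over [x-ct, x+ct]."""
    average = 0.5 * (g(x + c * t) + g(x - c * t))
    integral = simpson(h, x - c * t, x + c * t)
    return average + integral / (2.0 * c)

c = 2.0
value = dalembert(math.sin, lambda s: c * math.cos(s), c, 0.7, 0.3)
print(value, math.sin(0.3 + c * 0.7))  # the two numbers agree closely
```

For data with limited smoothness a larger `n`, or a quadrature rule adapted to the kinks of $h$, may be needed to reach comparable accuracy.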


To highlight the role played by the characteristic lines in the solution of Theorem 4.1,
consider the functions

\[
u_\pm(x) := \frac{1}{2} g(x) \mp \frac{1}{2c} \int_0^x h(\tau) \, d\tau.
\]

In terms of $u_\pm$, the solution (4.8) simplifies to

\[
u(t, x) = u_+(x - ct) + u_-(x + ct), \tag{4.14}
\]


matching  the  form  of  the  solution  stated  in  (1.3).  The  subscripts  in  u±  indicate  the 
propagation  direction,  i.e.,  u+  propagates  to  the  right  and  u_  to  the  left.  In  either 
direction  the  speed  of  propagation  is  the  parameter  c. 

Example 4.2 Consider the wave equation (4.5) with the initial conditions $h(x) = 0$
and

\[
g(x) = \begin{cases} (1 - x^2)^2, & |x| \le 1, \\ 0, & |x| > 1. \end{cases}
\]

By (4.8) the solution is the superposition of two localized bumps, which propagate
in opposite directions as illustrated in Fig. 4.3. ◊

Fig. 4.3 Evolution of a solution to the wave equation
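The splitting of the initial bump in Example 4.2 can be reproduced in a few lines. The sketch assumes the profile $g(x) = (1 - x^2)^2$ on $[-1, 1]$ (zero elsewhere), $h = 0$, and a propagation speed $c = 1$ chosen for illustration; with $h = 0$ the solution formula reduces to the average of two translates of $g$.

```python
def g(x):
    """Initial displacement assumed for Example 4.2: a bump supported in [-1, 1]."""
    return (1.0 - x * x) ** 2 if abs(x) <= 1.0 else 0.0

def u(t, x, c=1.0):
    """Solution with h = 0: the average of right- and left-moving copies of g."""
    return 0.5 * (g(x + c * t) + g(x - c * t))

# At t = 3 the two half-height bumps are centered at x = -3 and x = 3,
# and the string has returned to rest in between.
print(u(0.0, 0.0), u(3.0, -3.0), u(3.0, 3.0), u(3.0, 0.0))
```

The value at the origin drops from the full height at $t = 0$ to zero once both bumps have moved away, while each bump carries half the original height.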

In  Example  4.2  the  initial  condition  was  supported  in  [—1,  1],  and  we  can  see 
in  Fig.  4.3  that  the  resulting  solution  has  support  in  a  V-shaped  region.  This  region 
could  be  identified  as  the  span  of  the  characteristic  lines  emerging  from  the  initial 
support  interval. 

This  restriction  of  the  support  of  a  solution  is  closely  related  to  Huygens'  principle, 
an  empirical  law  for  propagation  of  light  waves  published  by  Christiaan  Huygens 
in  1678.  The  one-dimensional  wave  equation  exhibits  a  special,  strict  form  of  this 
principle: 

Theorem 4.3 (Huygens' principle in dimension one) Suppose $u$ solves the wave
equation (4.5) for $t \ge 0$, $x \in \mathbb{R}$, with initial data given by (4.7). If the functions $g$, $h$
are supported in a bounded interval $[a, b]$, then

\[
\operatorname{supp} u \subset \left\{ (t, x) \in \mathbb{R}_+ \times \mathbb{R} ; \; x \in [a - ct, b + ct] \right\}.
\]

Proof Consider the components of the solution (4.8). The $g$ term will vanish unless
$x \pm ct \in [a, b]$. The support of this term is thus restricted to $x \in [a - ct, b - ct]$ or
$x \in [a + ct, b + ct]$. As for the $h$ term, the integral over $\tau$ will vanish unless the interval
$[x - ct, x + ct]$ intersects $[a, b]$, which occurs only when $x \in [a - ct, b + ct]$. □

The restriction of support described in Theorem 4.3 is illustrated in Fig. 4.4. The
term $g$ contributes only in the regions shown in blue, but the $h$ term may contribute
throughout the full support region. However, the solution is constant (equal to
$\frac{1}{2c} \int_a^b h(\tau) \, d\tau$) when $[a, b]$ is contained in $[x - ct, x + ct]$. This constant region is shown
in purple in Fig. 4.4.
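The constant region can be observed directly with a simple choice of data. The following sketch uses hypothetical data picked for illustration, $g = 0$ and $h(x) = (1 - x^2)^2$ on $[a, b] = [-1, 1]$, for which the integral of $h$ has a closed form, so no quadrature is needed. With $c = 1$ the predicted constant is $\frac{1}{2}\int_{-1}^{1} (1 - \tau^2)^2 \, d\tau = \frac{8}{15}$.

```python
def clipped_primitive(x):
    """Integral of (1 - s^2)^2 from -1 to x, with the integrand set to zero outside [-1, 1]."""
    s = max(-1.0, min(1.0, x))
    return (s - 2.0 * s**3 / 3.0 + s**5 / 5.0) + 8.0 / 15.0

def u(t, x, c=1.0):
    """h term of the solution formula for g = 0 and the hypothetical h described above."""
    return (clipped_primitive(x + c * t) - clipped_primitive(x - c * t)) / (2.0 * c)

# Inside the region where [x - ct, x + ct] contains [-1, 1], u equals 8/15;
# outside [a - ct, b + ct] it vanishes.
print(u(3.0, 0.5), u(3.0, -1.5), u(3.0, 5.0))
```

The first two values coincide because both points lie in the constant region, and the third is zero because the point lies outside the support described in Theorem 4.3.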

Example 4.4 Suppose the initial data from Example 4.2 are altered to include a
singularity at $x = 0$. For example,

\[
g(x) = \begin{cases} (1 - |x|)^2, & |x| \le 1, \\ 0, & |x| > 1. \end{cases}
\]

Then (4.8) still gives a formula for the solution even though $g$ is not differentiable.
(This is a weak solution in the sense we will describe in Chap. 10.) A set of solutions
at different points in time is plotted in Fig. 4.5. Observe that the original singularity
splits into two singularities, which propagate outward along the two characteristic
lines emanating from $x = 0$. ◊

Fig. 4.4 Support of a wave solution with initial data in a bounded interval

Fig. 4.5 Propagation of singularities of the wave equation along characteristic lines


4.3  Boundary  Problems 

In the string model of Sect. 4.1 the domain of the wave equation (4.5) was restricted
to $x \in [0, \ell]$, with Dirichlet boundary conditions

\[
u(t, 0) = u(t, \ell) = 0, \quad \text{for all } t \ge 0. \tag{4.15}
\]

Suppose the initial data are given for $x \in [0, \ell]$ by

\[
u(0, x) = g(x), \qquad \frac{\partial u}{\partial t}(0, x) = h(x), \tag{4.16}
\]

with $g \in C^2[0, \ell]$, $h \in C^1[0, \ell]$. Both $g$ and $h$ are assumed to vanish at the endpoints
of $[0, \ell]$.

The  solution  of  the  wave  equation  on  R  provided  in  Theorem  4.1  can  be  adapted 
to  the  boundary  conditions  (4.15).  The  idea  is  to  extend  g,  h  to  R  in  such  a  way  that 
the  formula  (4.8)  gives  a  solution  satisfying  the  boundary  conditions  for  all  t. 

Theorem 4.5 The wave equation (4.5) on $[0, \ell]$, with Dirichlet boundary conditions
and satisfying the initial conditions (4.16), admits a solution of the form (4.8) only
if the initial data admit extensions to $\mathbb{R}$ as odd, $2\ell$-periodic functions, with $g \in C^2(\mathbb{R})$ and
$h \in C^1(\mathbb{R})$.

Proof By linearity we can consider the $g$ and $h$ terms independently. Assume that
the $g$ term,

\[
\frac{1}{2} \left[ g(x + ct) + g(x - ct) \right], \tag{4.17}
\]

is defined for all $t$ and $x$ and satisfies the boundary conditions on $[0, \ell]$ for all values
of $t$. At $x = 0$ the condition $u(t, 0) = 0$ will be satisfied if and only if

\[
g(ct) + g(-ct) = 0, \quad \text{for all } t \ge 0.
\]

In other words, $u(t, 0) = 0$ if and only if $g$ is odd. At $x = \ell$ the condition is

\[
g(\ell + ct) + g(\ell - ct) = 0, \quad \text{for all } t \ge 0.
\]

This is equivalent to the condition that $g$ is odd with respect to reflection at the point
$x = \ell$.

The composition of the reflections about 0 and $\ell$ gives translation by $2\ell$. Hence
the expression (4.17) satisfies the boundary conditions if and only if $g$ is odd and
$2\ell$-periodic.

A similar argument works for the $h$ term,

\[
u(t, x) = \frac{1}{2c} \int_{x - ct}^{x + ct} h(\tau) \, d\tau. \tag{4.18}
\]

The requirement at $x = 0$ is

\[
\int_{-ct}^{ct} h(\tau) \, d\tau = 0, \quad \text{for all } t \ge 0. \tag{4.19}
\]

Differentiation with respect to $t$, using the fundamental theorem of calculus, shows
that (4.19) is satisfied if and only if $h$ is odd with respect to reflection at 0. Similarly,
the condition

\[
\int_{\ell - ct}^{\ell + ct} h(\tau) \, d\tau = 0, \quad \text{for all } t \ge 0
\]

requires odd symmetry with respect to reflection at $x = \ell$. □


Example 4.6 Consider the vibrating string problem with $c = 1$ and $\ell = 1$. Suppose
that the solution initially has the form (4.14) with $u_+ = 0$ and the left-propagating
solution given by the function $u_-$ shown in Fig. 4.6. For small $t > 0$ the solution is

\[
u(t, x) = u_-(x + t), \tag{4.20}
\]

but eventually the bump hits the boundary at $x = 0$, and we would like to understand
what happens then.

To apply Theorem 4.5, we must first solve for $g$ and $h$ in terms of $u_-$. By (4.20)
we set $g(x) = u_-(x)$ and

\[
h(x) = \left. \frac{\partial}{\partial t} u_-(x + t) \right|_{t = 0}.
\]

The resulting functions $g$ and $h$, extended to odd functions on $\mathbb{R}$, are shown in Fig. 4.7.

According to Theorem 4.5 we can compute the solution from (4.8) using these
odd periodic extensions of $g$ and $h$. The results are shown in Fig. 4.8. The bump
temporarily disappears at $t = 0.3$ and then reemerges as an inverted bump traveling
in the opposite direction. ◊


Fig. 4.6 The initial waveform $u_-$

Fig. 4.7 The odd extensions of the initial conditions $g$ and $h$

Fig. 4.8 Reflection of a propagating bump at the endpoint of the string


4.4 Forcing Terms

The derivation of the string model in Sect. 4.1 assumed that no external forces act on the string. Additional forces could be incorporated by adding extra terms to the expression (4.1) for the force on a segment. In the continuum limit this yields a forcing term on the right-hand side of (4.5):

$$\frac{\partial^2 u}{\partial t^2} - c^2 \frac{\partial^2 u}{\partial x^2} = f, \tag{4.21}$$


where $f = f(t, x)$. The forcing term could be used to model plucking or bowing of the string, for example.

In  this  section  we  introduce  a  technique,  called  Duhamel’s  method ,  that  allows  us 
to  adapt  solution  methods  for  evolution  equations  to  include  a  forcing  term.  The  idea, 
which  is  closely  related  to  a  standard  ODE  technique  called  variation  of  parameters, 
is  to  reformulate  the  forcing  term  as  an  initial  condition.  This  technique  is  named 
for  the  19th  century  French  mathematician  and  physicist  Jean-Marie  Duhamel,  who 
developed the idea in a study of the heat equation.

Fig. 4.9 Domain of dependence for the point $(t, x)$


To focus our attention on the driving term, let us consider (4.21) on the domain $x \in \mathbb{R}$ with the initial conditions set to zero. For a given $c$, define the domain of dependence of a point $(t, x)$ with $t > 0$ and $x \in \mathbb{R}$ by

$$\mathcal{D}_{t,x} := \{(s, x') \in \mathbb{R}_+ \times \mathbb{R} : x - c(t - s) \le x' \le x + c(t - s)\}.$$

This is a triangular region, as pictured in Fig. 4.9. The terminology refers to the fact that the solution $u(t, x)$ is influenced only by the values of $f$ within $\mathcal{D}_{t,x}$, as the next result shows.

Theorem 4.7 For $f \in C^1(\mathbb{R}^2)$, the unique solution of (4.21) satisfying the initial conditions

$$u(0, x) = 0, \qquad \frac{\partial u}{\partial t}(0, x) = 0,$$

is given by

$$u(t, x) = \frac{1}{2c} \int_{\mathcal{D}_{t,x}} f(s, x')\, dx'\, ds. \tag{4.22}$$


Proof For each $s \ge 0$, let $\eta_s(t, x)$ be the solution of the homogeneous wave equation (4.5) for $t \ge s$, subject to the initial conditions

$$\eta_s(t, x)\big|_{t=s} = 0, \qquad \frac{\partial \eta_s}{\partial t}(t, x)\Big|_{t=s} = f(s, x). \tag{4.23}$$

This function can be written explicitly by shifting $t$ to $t - s$ in (4.8),

$$\eta_s(t, x) = \frac{1}{2c} \int_{x-c(t-s)}^{x+c(t-s)} f(s, x')\, dx'. \tag{4.24}$$

We claim that the solution of (4.21) is given by the integral

$$u(t, x) := \int_0^t \eta_s(t, x)\, ds. \tag{4.25}$$

Note that the integration variable here is $s$ rather than $t$.

We will first check that this definition of $u$ satisfies the initial conditions. For $t = 0$ the integral in (4.25) clearly vanishes, so that $u(0, x) = 0$ is satisfied. By the fundamental theorem of calculus, differentiating (4.25) with respect to $t$ gives

$$\frac{\partial u}{\partial t}(t, x) = \eta_s(t, x)\big|_{s=t} + \int_0^t \frac{\partial \eta_s}{\partial t}(t, x)\, ds.$$

The first term vanishes for all $t$ by the initial condition (4.23), leaving

$$\frac{\partial u}{\partial t}(t, x) = \int_0^t \frac{\partial \eta_s}{\partial t}(t, x)\, ds, \tag{4.26}$$

for all $t \ge 0$. Setting $t = 0$ gives

$$\frac{\partial u}{\partial t}(0, x) = 0.$$


Now let us check that the $u$ defined in (4.25) solves (4.21). Differentiating (4.26) once more gives

$$\frac{\partial^2 u}{\partial t^2}(t, x) = \frac{\partial \eta_s}{\partial t}(t, x)\Big|_{s=t} + \int_0^t \frac{\partial^2 \eta_s}{\partial t^2}(t, x)\, ds. \tag{4.27}$$

By (4.23) the first term on the right is equal to $f(t, x)$. To simplify the second term, we use the fact that $\eta_s$ solves (4.5) and the definition of $u$ to compute

$$\int_0^t \frac{\partial^2 \eta_s}{\partial t^2}(t, x)\, ds = c^2 \int_0^t \frac{\partial^2 \eta_s}{\partial x^2}(t, x)\, ds = c^2 \frac{\partial^2 u}{\partial x^2}(t, x).$$

Therefore, (4.27) reduces to

$$\frac{\partial^2 u}{\partial t^2} = f + c^2 \frac{\partial^2 u}{\partial x^2},$$

proving that $u$ solves (4.21).

Combining (4.24) and (4.25), we can write the formula for $u$ as

$$u(t, x) = \frac{1}{2c} \int_0^t \left( \int_{x-c(t-s)}^{x+c(t-s)} f(s, x')\, dx' \right) ds,$$

which is equivalent to (4.22).

To prove uniqueness, suppose $u_1$ and $u_2$ are solutions of (4.21). Then $u_2 - u_1$ is a solution of (4.5). Since $u_2 - u_1$ also has vanishing initial conditions, Theorem 4.1 implies that $u_2 - u_1 = 0$. Hence the solution is unique. □

Fig. 4.10 Range of influence of the point $(t_0, x_0)$

By  the  superposition  principle,  Theorem  4.7  is  easily  extended  to  the  case  of 
nonzero  initial  conditions,  by  setting  u  =  v  +  w  where  v  is  a  solution  of  the  form 
(4.22)  and  w  is  a  solution  of  the  form  (4.8). 
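The Duhamel formula (4.22) is easy to test numerically. As a hedged sketch (the function name `duhamel`, the quadrature resolution, and the sample forcing are our own choices, not taken from the text), we approximate the integral over the triangle $\mathcal{D}_{t,x}$ with a midpoint rule. For the sample forcing $f(s, x') = s x'$ one can check by direct differentiation that $u(t, x) = x t^3/6$ satisfies (4.21) with vanishing initial data, which gives an exact value to compare against.

```python
def duhamel(f, c, t, x, n_s=200, n_x=200):
    """Approximate (4.22): (1/(2c)) times the integral of f over D_{t,x}.

    Midpoint rule in s on [0, t]; for each s, midpoint rule in x'
    on the interval [x - c(t - s), x + c(t - s)].
    """
    ds = t / n_s
    total = 0.0
    for i in range(n_s):
        s = (i + 0.5) * ds
        half_width = c * (t - s)
        dx = 2.0 * half_width / n_x
        inner = sum(f(s, x - half_width + (j + 0.5) * dx) for j in range(n_x))
        total += inner * dx * ds
    return total / (2.0 * c)

def forcing(s, xp):
    # Sample forcing term f(s, x') = s * x' (an assumption for illustration).
    return s * xp

C = 3.0  # hypothetical propagation speed
for t, x in [(0.5, 0.2), (1.0, -1.0), (2.0, 0.7)]:
    approx = duhamel(forcing, C, t, x)
    exact = x * t ** 3 / 6.0
    print(f"t = {t}, x = {x}: quadrature {approx:.6f}, exact {exact:.6f}")
```

Only values of the forcing inside the triangle are ever sampled, which is the domain-of-dependence statement in computational form.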

The  concept  of  domain  of  dependence  still  applies  when  the  initial  conditions  are 
nonzero. In the solution formula (4.8), $u(t, x)$ depends only on the values of $g$ and $h$ at the base of the triangle, $\mathcal{D}_{t,x} \cap \{t = 0\}$. Thus it is still the case that the solution $u(t, x)$ depends only on the data within $\mathcal{D}_{t,x}$.

The existence of the domain of dependence is a limitation imposed by the propagation speed $c$. For systems governed by the wave equations (4.5) or (4.21), no information can travel at a speed faster than $c$.

The region of the space-time plane in which solutions can be affected by the data at a particular point $(t_0, x_0)$ is called the range of influence of this point. By the definition of the domain of dependence, the range of influence consists of the points $(t, x)$ such that $(t_0, x_0) \in \mathcal{D}_{t,x}$. This region is a triangle with vertex $(t_0, x_0)$ and sides given by the characteristics $(t, x_0 \pm c(t - t_0))$, as shown in Fig. 4.10.

Duhamel’s method applies also to the case of a vibrating string with fixed ends. Assuming that $f(t, x)$ satisfies the boundary conditions at $x = 0$ and $x = \ell$, we extend $f$ in the $x$ variable to an odd $2\ell$-periodic function on $\mathbb{R}$, just as in Theorem 4.5. This extension guarantees that the intermediate solution $\eta_s$ defined by (4.24) will satisfy the boundary conditions also. And then so will the solution $u(t, x)$ defined by (4.25).

Example 4.8 Consider a string of length $\ell$ with propagation speed $c = 1$. Suppose the forcing term is given by

$$f(t, x) = \cos(\omega t) \sin(\omega_0 x), \tag{4.28}$$

where $\omega_0 := \pi/\ell$ and $\omega > 0$ is the driving frequency. Since $\sin(\omega_0 x)$ is odd and $2\ell$-periodic, the extension required by Theorem 4.5 is automatic. As in Theorem 4.7, let us set the initial conditions $g = h = 0$ to focus on the forcing term.

Substituting (4.28) into (4.22) gives

$$\begin{aligned} u(t, x) &= \frac{1}{2} \int_0^t \int_{x-t+s}^{x+t-s} \cos(\omega s) \sin(\omega_0 x')\, dx'\, ds \\ &= \frac{1}{2\omega_0} \int_0^t \left[ \cos(\omega_0(x - t + s)) - \cos(\omega_0(x + t - s)) \right] \cos(\omega s)\, ds. \end{aligned}$$

A trigonometric identity reduces this to

$$u(t, x) = \frac{\sin(\omega_0 x)}{\omega_0} \int_0^t \sin(\omega_0(t - s)) \cos(\omega s)\, ds. \tag{4.29}$$

For $\omega \ne \omega_0$ we obtain

$$u(t, x) = \frac{\sin(\omega_0 x)}{\omega_0^2 - \omega^2} \left[ \cos(\omega t) - \cos(\omega_0 t) \right].$$

Note that the $x$ dependence of the solution matches that of the forcing term. The interesting part of this solution is the oscillation, which includes both frequencies $\omega$ and $\omega_0$. Figure 4.11 illustrates the behavior of the amplitude as a function of time, in a case where $\omega \ll \omega_0$. The large-scale oscillation has a period proportional to $1/\omega$, corresponding to the low driving frequency. The solution also exhibits fast oscillations at the frequency $\omega_0$, which depends only on $\ell$.

For $\omega = \omega_0$ the formula (4.29) gives the solution

$$u(t, x) = \frac{t}{2\omega_0} \sin(\omega_0 x) \sin(\omega_0 t).$$

The resulting amplitude grows linearly, as shown in Fig. 4.12. ◊
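The two closed forms in Example 4.8 can be compared against a direct quadrature of the time integral in (4.29), namely $\int_0^t \sin(\omega_0(t - s)) \cos(\omega s)\, ds$ multiplied by $\sin(\omega_0 x)/\omega_0$. The sketch below is only an illustration: the function names, the step count, and the sample times are assumptions, and the midpoint rule is just one convenient choice of quadrature.

```python
import math

def time_factor_quadrature(w, w0, t, steps=20000):
    """Midpoint-rule value of the integral over s in [0, t] of
    sin(w0 (t - s)) cos(w s), the time integral appearing in (4.29)."""
    ds = t / steps
    return sum(
        math.sin(w0 * (t - (i + 0.5) * ds)) * math.cos(w * (i + 0.5) * ds)
        for i in range(steps)
    ) * ds

def u_from_integral(w, w0, t, x):
    """Solution (4.29) with the time integral evaluated numerically."""
    return math.sin(w0 * x) / w0 * time_factor_quadrature(w, w0, t)

def u_closed_form(w, w0, t, x):
    """Closed forms from Example 4.8 (c = 1), resonant and non-resonant."""
    if math.isclose(w, w0):
        return t / (2.0 * w0) * math.sin(w0 * x) * math.sin(w0 * t)
    return math.sin(w0 * x) / (w0 ** 2 - w ** 2) * (math.cos(w * t) - math.cos(w0 * t))

ELL = 1.0                # hypothetical string length
W0 = math.pi / ELL       # natural frequency
for w in (W0 / 10.0, W0):            # off-resonance and resonant driving
    for t in (1.0, 5.3, 20.7):
        a = u_from_integral(w, W0, t, 0.5 * ELL)
        b = u_closed_form(w, W0, t, 0.5 * ELL)
        print(f"w/w0 = {w / W0:.1f}, t = {t:4.1f}: integral {a:+.6f}, closed form {b:+.6f}")
```

In the resonant case the factor $t/(2\omega_0)$ makes the envelope grow without bound, whereas the off-resonance amplitude stays bounded by $2/|\omega_0^2 - \omega^2|$.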

The physical phenomenon illustrated by Example 4.8 is called resonance. If the string is driven at its natural frequency $\omega_0$ then it will continually absorb energy from the driving force. Of course, there is a limit to how much energy a physical string could absorb before it breaks. Once the displacement amplitude becomes sufficiently large, the linear wave equation (4.5) no longer serves as an appropriate model.


Fig. 4.11 Oscillation pattern with a driving frequency $\omega = \omega_0/10$

Fig. 4.12 Growth of the amplitude at the resonance frequency $\omega = \omega_0$


4.5  Model  Problem:  Acoustic  Waves 

The vibration of a drumhead can be modeled on a bounded domain $\Omega \subset \mathbb{R}^2$, with a function $u(t, x)$ representing the vertical displacement of the membrane at time $t$ and position $x \in \Omega$. With arguments similar to those in Sect. 4.1, one can derive the equation

$$\frac{\partial^2 u}{\partial t^2} - c^2 \Delta u = 0, \tag{4.30}$$

where $\Delta$ is the Laplacian operator (1.7). The wave equation (4.30) appears in many
other  contexts  as  well,  including  the  propagation  of  light  and  all  other  forms  of 
electromagnetic  radiation.  In  all  these  cases  the  constant  c  represents  the  speed  of 
propagation. 

In  this  section  we  will  derive  the  three-dimensional  wave  equation  as  a  model 
for  acoustic  waves  traveling  through  the  air.  Acoustic  waves  consist  of  fluctuations 
of  pressure  which  propagate  through  a  gas.  To  analyze  them,  we  must  consider  the 
relationships between the pressure $P$, the velocity field $\mathbf{v}$, and the density $\rho$. For a gas in motion these are all functions of both time and position.

Because acoustic waves involve minute pressure fluctuations with very little heat transfer, the relationship between pressure and density is given by the adiabatic gas law

$$P = C\rho^\gamma, \tag{4.31}$$

where $C$ and $\gamma$ are physical constants. We will fix background atmospheric values of the pressure $P_0$ and density $\rho_0$ and focus on the deviations

$$u := P - P_0, \qquad \sigma := \rho - \rho_0.$$

Applying (4.31) to $P/P_0$ gives the equation

$$1 + \frac{u}{P_0} = \left( 1 + \frac{\sigma}{\rho_0} \right)^\gamma.$$

Since $\sigma/\rho_0$ is assumed to be very small, we can linearize by taking a first-order Taylor approximation on the right side. This yields

$$u = \frac{\gamma P_0}{\rho_0}\, \sigma. \tag{4.32}$$


The dynamics of the gas are modeled with two conservation laws. The first is conservation of mass (3.20), which yields

$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{v}) = 0.$$

Since $\sigma$ and $\mathbf{v}$ are both assumed to be very small, for the leading approximation we can replace $\rho$ by $\rho_0$ to obtain

$$\frac{\partial \sigma}{\partial t} + \rho_0 \nabla \cdot \mathbf{v} = 0. \tag{4.33}$$


The second dynamical law is conservation of momentum. This is encapsulated in a fluid equation derived by Euler in 1757, called Euler’s force equation:

$$-\nabla P = \rho \left( \frac{\partial}{\partial t} + \mathbf{v} \cdot \nabla \right) \mathbf{v}.$$

Euler’s equation is an aggregate form of Newton’s second law (force equals mass times acceleration). Note that the “acceleration” term on the right is the Lagrangian derivative of the velocity field $\mathbf{v}$. As above, we substitute $P = P_0 + u$ and $\rho = \rho_0 + \sigma$ and keep only the first order terms to derive the linearization

$$-\nabla u = \rho_0 \frac{\partial \mathbf{v}}{\partial t}. \tag{4.34}$$


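The final step is to eliminate the velocity field from the equation. Substituting (4.32) into (4.33) and differentiating with respect to time gives

$$\frac{\partial^2 u}{\partial t^2} = -\gamma P_0 \frac{\partial}{\partial t} (\nabla \cdot \mathbf{v}). \tag{4.35}$$

On the other hand, by (4.34),

$$\frac{\partial}{\partial t} (\nabla \cdot \mathbf{v}) = \nabla \cdot \frac{\partial \mathbf{v}}{\partial t} = -\nabla \cdot \left( \frac{\nabla u}{\rho_0} \right) = -\frac{1}{\rho_0} \Delta u.$$

Substituting this in (4.35) yields the acoustic wave equation,

$$\frac{\partial^2 u}{\partial t^2} - \frac{\gamma P_0}{\rho_0} \Delta u = 0.$$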


As  with  our  previous  derivations,  many  approximations  are  required  to  produce  a 
linear  equation.  These  simplifications  are  well  justified  for  sound  waves  at  ordinary 
volume  levels,  but  more  dramatic  pressure  fluctuations  would  require  a  nonlinear 
equation. 
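Eliminating $\mathbf{v}$ from (4.32) to (4.34) leads to $\partial^2 u/\partial t^2 = (\gamma P_0/\rho_0) \Delta u$, so comparison with (4.30) suggests the propagation speed $c = \sqrt{\gamma P_0/\rho_0}$. As a rough plausibility check, the sketch below evaluates this for dry air. The numerical values are standard sea-level reference figures that we assume here; they do not come from the text.

```python
import math

def sound_speed(gamma, pressure, density):
    """Propagation speed c = sqrt(gamma * P0 / rho0) from the linearized model."""
    return math.sqrt(gamma * pressure / density)

# Assumed reference values for dry air near sea level (not from the text).
GAMMA_AIR = 1.4      # adiabatic exponent, dimensionless
P0_AIR = 101325.0    # background pressure in Pa
RHO0_AIR = 1.225     # background density in kg/m^3

c_air = sound_speed(GAMMA_AIR, P0_AIR, RHO0_AIR)
print(f"c = {c_air:.1f} m/s")
```

The result is roughly 340 m/s, in line with the familiar speed of sound in air, which lends some support to the approximations made in the derivation.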


4.6  Integral  Solution  Formulas 


Let us consider the wave equation (4.30) on $\mathbb{R}^3$ with $c = 1$,

$$\frac{\partial^2 u}{\partial t^2} - \Delta u = 0.$$

This problem can be reduced to the one-dimensional case by a clever averaging trick. For $f \in C^0(\mathbb{R}^3)$, define

$$\tilde{f}(x; \rho) := \frac{1}{4\pi\rho} \int_{\partial B(x;\rho)} f(w)\, dS(w), \tag{4.36}$$

where $x \in \mathbb{R}^3$ and $\rho > 0$. The surface area of $\partial B(x; \rho)$ is $4\pi\rho^2$, so $\tilde{f}$ is $\rho$ times the spherical average of $f$. By continuity the spherical average approaches the value of the function at the center point as $\rho \to 0$, so that

$$\lim_{\rho \to 0} \frac{\tilde{f}(x; \rho)}{\rho} = f(x). \tag{4.37}$$

The dimensional reduction of the wave equation is based on the following formula of Jean-Gaston Darboux.

Lemma 4.9 (Darboux’s formula) For $f \in C^2(\mathbb{R}^3)$,

$$\frac{\partial^2}{\partial \rho^2} \tilde{f}(x; \rho) = \Delta_x \tilde{f}(x; \rho).$$

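Proof To compute the radial derivative of the spherical average, it is helpful to change coordinates by setting $w = x + \rho y$, so that the domain of $y$ is the unit sphere $\mathbb{S}^2 \subset \mathbb{R}^3$,

$$\frac{1}{4\pi\rho^2} \int_{\partial B(x;\rho)} f(w)\, dS(w) = \frac{1}{4\pi} \int_{\mathbb{S}^2} f(x + \rho y)\, dS(y).$$

Differentiation under the integral gives

$$\frac{\partial}{\partial \rho} \left[ \frac{1}{4\pi} \int_{\mathbb{S}^2} f(x + \rho y)\, dS(y) \right] = \frac{1}{4\pi} \int_{\mathbb{S}^2} \nabla f(x + \rho y) \cdot y\, dS(y).$$

In the original coordinates this implies

$$\frac{\partial}{\partial \rho} \left[ \frac{1}{4\pi\rho^2} \int_{\partial B(x;\rho)} f(w)\, dS(w) \right] = \frac{1}{4\pi\rho^2} \int_{\partial B(x;\rho)} \nabla f(w) \cdot \left( \frac{w - x}{\rho} \right) dS(w). \tag{4.38}$$

Since $(w - x)/\rho$ is the outward unit normal to $\partial B(x; \rho)$,

$$\nabla f(w) \cdot \left( \frac{w - x}{\rho} \right) = \frac{\partial f}{\partial \nu}(w).$$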


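Furthermore, by Corollary 2.8,

$$\int_{\partial B(x;\rho)} \frac{\partial f}{\partial \nu}(w)\, dS(w) = \int_{B(x;\rho)} \Delta f(w)\, d^3w.$$

Applying this to the right-hand side of (4.38) gives

$$\frac{\partial}{\partial \rho} \left[ \frac{1}{4\pi\rho^2} \int_{\partial B(x;\rho)} f(w)\, dS(w) \right] = \frac{1}{4\pi\rho^2} \int_{B(x;\rho)} \Delta f(w)\, d^3w. \tag{4.39}$$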


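Substituting the definition of $\tilde{f}$ in (4.39) yields

$$\frac{\partial}{\partial \rho} \tilde{f}(x; \rho) = \frac{1}{4\pi\rho^2} \int_{\partial B(x;\rho)} f(w)\, dS(w) + \frac{1}{4\pi\rho} \int_{B(x;\rho)} \Delta f(w)\, d^3w.$$

A further differentiation using (4.39) and the radial derivative formula from Exercise 2.4 then gives

$$\begin{aligned} \frac{\partial^2}{\partial \rho^2} \tilde{f}(x; \rho) &= \frac{1}{4\pi\rho} \frac{\partial}{\partial \rho} \int_{B(x;\rho)} \Delta f(w)\, d^3w \\ &= \frac{1}{4\pi\rho} \int_{\partial B(x;\rho)} \Delta f(w)\, dS(w). \end{aligned} \tag{4.40}$$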


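On the other hand

$$\begin{aligned} \Delta_x \tilde{f}(x; \rho) &= \Delta_x \left[ \frac{\rho}{4\pi} \int_{\mathbb{S}^2} f(x + \rho y)\, dS(y) \right] \\ &= \frac{\rho}{4\pi} \int_{\mathbb{S}^2} \Delta f(x + \rho y)\, dS(y) \\ &= \frac{1}{4\pi\rho} \int_{\partial B(x;\rho)} \Delta f(w)\, dS(w). \end{aligned} \tag{4.41}$$

The claim thus follows from (4.40).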


□ 


Lemma 4.9 allows us to relate the three-dimensional wave equation in variables $(t, x)$ to a one-dimensional equation in variables $(t, \rho)$. The result is a solution formula first derived in 1883 by the physicist Gustav Kirchhoff.

Theorem 4.10 (Kirchhoff’s integral formula) For $u \in C^2([0, \infty) \times \mathbb{R}^3)$, suppose that

$$\frac{\partial^2 u}{\partial t^2} - \Delta u = 0$$

under the initial conditions

$$u|_{t=0} = g, \qquad \frac{\partial u}{\partial t}\Big|_{t=0} = h.$$

Then

$$u(t, x) = \frac{\partial}{\partial t} \tilde{g}(x; t) + \tilde{h}(x; t),$$

with $\tilde{g}$ and $\tilde{h}$ defined as in (4.36).


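Proof Define

$$\tilde{u}(t, x; \rho) := \frac{1}{4\pi\rho} \int_{\partial B(x;\rho)} u(t, w)\, dS(w).$$

Since $u$ satisfies the wave equation, differentiating under the integral gives

$$\frac{\partial^2}{\partial t^2} \tilde{u}(t, x; \rho) = \frac{1}{4\pi\rho} \int_{\partial B(x;\rho)} \Delta u(t, w)\, dS(w).$$

By the calculation (4.41) this is equivalent to

$$\frac{\partial^2}{\partial t^2} \tilde{u}(t, x; \rho) = \Delta_x \tilde{u}(t, x; \rho).$$

Lemma 4.9 then shows that

$$\left( \frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial \rho^2} \right) \tilde{u}(t, x; \rho) = 0. \tag{4.42}$$

The initial conditions for $\tilde{u}$ follow from the initial conditions for $u$,

$$\tilde{u}(0, x; \rho) = \tilde{g}(x; \rho), \qquad \frac{\partial}{\partial t} \tilde{u}(0, x; \rho) = \tilde{h}(x; \rho).$$

By (4.37) we also have a boundary condition at $\rho = 0$,

$$\tilde{u}(t, x; 0) = 0.$$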

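Using Theorem 4.1 and the reflection argument from Theorem 4.5, we conclude that the unique solution of (4.42) under these conditions is given by extending $\tilde{g}(x; \rho)$ and $\tilde{h}(x; \rho)$ to $\rho \in \mathbb{R}$ with odd symmetry and then using the d’Alembert formula,

$$\tilde{u}(t, x; \rho) = \frac{1}{2} \left[ \tilde{g}(x; \rho + t) + \tilde{g}(x; \rho - t) \right] + \frac{1}{2} \int_{\rho-t}^{\rho+t} \tilde{h}(x; \tau)\, d\tau. \tag{4.43}$$

By (4.37), we can recover $u$ from this formula by setting

$$u(t, x) = \lim_{\rho \to 0} \frac{\tilde{u}(t, x; \rho)}{\rho}. \tag{4.44}$$

To evaluate this limit, first note that for $0 \le \rho \le t$ the odd symmetry of $\tilde{g}$ and $\tilde{h}$ with respect to $\rho$ can be used to rewrite (4.43) as

$$\tilde{u}(t, x; \rho) = \frac{1}{2} \left[ \tilde{g}(x; t + \rho) - \tilde{g}(x; t - \rho) \right] + \frac{1}{2} \int_{t-\rho}^{t+\rho} \tilde{h}(x; \tau)\, d\tau.$$

The computations are now straightforward:

$$\lim_{\rho \to 0} \frac{1}{2\rho} \left[ \tilde{g}(x; t + \rho) - \tilde{g}(x; t - \rho) \right] = \frac{\partial}{\partial t} \tilde{g}(x; t),$$

and

$$\lim_{\rho \to 0} \frac{1}{2\rho} \int_{t-\rho}^{t+\rho} \tilde{h}(x; \tau)\, d\tau = \tilde{h}(x; t).$$

The claimed solution formula thus follows from (4.44). □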
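Kirchhoff’s formula can be tried out on a case with a known answer. The sketch below is an illustration under our own assumptions (the helper name `tilde_average`, the quadrature grid, and the test data are not from the text). It takes $g = 0$ and $h(x) = |x|^2$, for which $u(t, x) = t|x|^2 + t^3$ satisfies $\partial_t^2 u = 6t = \Delta u$ with the required initial data, and compares this with $\tilde{h}(x; t)$, computed as $t$ times a numerical spherical mean.

```python
import math

def tilde_average(f, x, rho, n_z=60, n_phi=120):
    """Approximate f~(x; rho) from (4.36): rho times the mean of f over
    the sphere of radius rho centered at x.

    The unit sphere is parametrized by z = cos(polar angle) in [-1, 1] and
    the azimuth phi in [0, 2 pi). The area element is uniform in (z, phi),
    so a midpoint rule reduces the mean to a plain average.
    """
    total = 0.0
    for i in range(n_z):
        z = -1.0 + (i + 0.5) * 2.0 / n_z
        r = math.sqrt(1.0 - z * z)
        for j in range(n_phi):
            phi = (j + 0.5) * 2.0 * math.pi / n_phi
            w = (x[0] + rho * r * math.cos(phi),
                 x[1] + rho * r * math.sin(phi),
                 x[2] + rho * z)
            total += f(w)
    return rho * total / (n_z * n_phi)

def h(w):
    # Sample initial velocity h(x) = |x|^2 (an assumption for illustration).
    return w[0] ** 2 + w[1] ** 2 + w[2] ** 2

def u_exact(t, x):
    # Solution with g = 0 and h = |x|^2, verified by direct differentiation.
    return t * h(x) + t ** 3

x = (0.3, -1.2, 0.5)
for t in (0.5, 1.0, 2.0):
    approx = tilde_average(h, x, t)
    print(f"t = {t}: Kirchhoff {approx:.6f}, exact {u_exact(t, x):.6f}")
```

Note that only values of $h$ on the sphere $|w - x| = t$ enter the computation, which is the strict Huygens property in numerical form.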

One interesting consequence of the Kirchhoff formula is the fact that three-dimensional wave propagation exhibits a strict form of Huygens’ principle. Theorem 4.10 shows that the range of influence of the point $(t_0, x_0)$ is the forward light cone,

$$\Gamma_+(t_0, x_0) := \{(t, x);\ t \ge t_0,\ |x - x_0| = t - t_0\}.$$

This matches the result of Theorem 4.3 for the one-dimensional wave equation. The strict Huygens phenomenon is readily observable for acoustic waves, in the fact that a sudden sound like a clap propagates as a sharp wavefront that is heard as a single discrete event, without aftereffects unless there are reflective surfaces to cause an echo. Figure 4.13 illustrates this effect; an observer located away from the origin experiences a waveform of duration equal to the diameter of the initial impulse. The strict Huygens’ principle holds in every odd dimension greater than 1, but fails in even dimensions, as we will illustrate below.

Fig. 4.13 The plot on the right shows the observed waveform at a distance 3 from the origin, caused by the initial radial impulse shown on the left

The  spherical  averaging  trick  used  for  Theorem  4.10  also  works  in  higher  odd 
dimensions,  although  the  solution  formulas  become  more  complicated.  For  even 
dimensions,  solution  formulas  can  be  derived  from  the  odd-dimensional  case  by  a 
technique  called  the  method  of  descent. 

We will work this out for the two-dimensional case. Suppose $u \in C^2([0, \infty) \times \mathbb{R}^2)$ solves the wave equation with initial conditions

$$u|_{t=0} = g, \qquad \frac{\partial u}{\partial t}\Big|_{t=0} = h,$$

with $g$, $h$ functions on $\mathbb{R}^2$. If we extend $g$ and $h$ to $\mathbb{R}^3$ as functions that are independent of $x_3$, then Kirchhoff’s formula gives a solution to the three-dimensional problem. Since this solution is also independent of $x_3$, it “descends” to a solution in $\mathbb{R}^2$. The resulting formula was first worked out by Siméon Poisson in the early 19th century (well before Kirchhoff’s three-dimensional formula).

Corollary 4.11 (Poisson’s integral formula) For $u \in C^2([0, \infty) \times \mathbb{R}^2)$, suppose that

$$\frac{\partial^2 u}{\partial t^2} - \Delta u = 0$$

under the initial conditions

$$u|_{t=0} = g, \qquad \frac{\partial u}{\partial t}\Big|_{t=0} = h.$$

Then

$$u(t, x) = \frac{\partial}{\partial t} \left[ \frac{t}{2\pi} \int_{\mathbb{D}} \frac{g(x - ty)}{\sqrt{1 - |y|^2}}\, d^2y \right] + \frac{t}{2\pi} \int_{\mathbb{D}} \frac{h(x - ty)}{\sqrt{1 - |y|^2}}\, d^2y,$$

with the integrals taken over the unit disk $\mathbb{D} \subset \mathbb{R}^2$.


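Proof Following the procedure described above, we extend $g$ and $h$ to functions on $\mathbb{R}^3$ independent of $x_3$. In this case the integral (4.36) becomes

$$\tilde{g}(x; \rho) = \frac{\rho}{4\pi} \int_{\mathbb{S}^2} g(x_1 + \rho y_1, x_2 + \rho y_2)\, dS(y) \tag{4.45}$$

for $x \in \mathbb{R}^2$. By symmetry we can restrict our attention to the upper hemisphere, parametrized in polar coordinates by

$$y = \left( r \cos\theta,\ r \sin\theta,\ \sqrt{1 - r^2} \right).$$

The surface area element is

$$dS = \frac{r}{\sqrt{1 - r^2}}\, dr\, d\theta,$$

so that (4.45) becomes

$$\begin{aligned} \tilde{g}(x; \rho) &= \frac{\rho}{2\pi} \int_0^{2\pi} \int_0^1 \frac{g(x_1 + \rho r \cos\theta,\ x_2 + \rho r \sin\theta)}{\sqrt{1 - r^2}}\, r\, dr\, d\theta \\ &= \frac{\rho}{2\pi} \int_{\mathbb{D}} \frac{g(x + \rho y)}{\sqrt{1 - |y|^2}}\, d^2y. \end{aligned}$$

(The change of variables $y \to -y$ shows that $x + \rho y$ may be replaced by $x - \rho y$ in the last integral.) The claimed two-dimensional solution follows by substituting this formula for $\tilde{g}$ and the corresponding result for $\tilde{h}$ in the Kirchhoff formula from Theorem 4.10. □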
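The Poisson formula can be checked in the same spirit as the three-dimensional case. This is a hedged sketch with assumed names and test data: it evaluates the term $\frac{t}{2\pi} \int_{\mathbb{D}} f(x - ty)/\sqrt{1 - |y|^2}\, d^2y$ numerically for $g = 0$ and $h(x) = |x|^2$ in $\mathbb{R}^2$, where $u(t, x) = t|x|^2 + \tfrac{2}{3}t^3$ satisfies $\partial_t^2 u = 4t = \Delta u$ with the required initial data. The substitution $|y| = \sin\varphi$ is our own choice for taming the weight, which is singular at the rim of the disk.

```python
import math

def poisson_term(f, x, t, n_phi=400, n_theta=64):
    """Approximate (t / (2 pi)) times the integral over the unit disk of
    f(x - t y) / sqrt(1 - |y|^2) d^2 y.

    With |y| = sin(phi), the weight r dr / sqrt(1 - r^2) becomes
    sin(phi) dphi, which removes the singularity at |y| = 1.
    """
    dphi = 0.5 * math.pi / n_phi
    dtheta = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_phi):
        phi = (i + 0.5) * dphi
        r = math.sin(phi)
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            y1, y2 = r * math.cos(theta), r * math.sin(theta)
            total += f((x[0] - t * y1, x[1] - t * y2)) * math.sin(phi)
    return t / (2.0 * math.pi) * total * dphi * dtheta

def h(p):
    # Sample initial velocity h(x) = |x|^2 (an assumption for illustration).
    return p[0] ** 2 + p[1] ** 2

x = (0.8, -0.4)
for t in (0.5, 1.0, 2.0):
    approx = poisson_term(h, x, t)
    exact = t * h(x) + 2.0 * t ** 3 / 3.0
    print(f"t = {t}: Poisson {approx:.6f}, exact {exact:.6f}")
```

In contrast to the three-dimensional computation, the quadrature here samples $h$ throughout the solid disk $|w - x| \le t$, not just on its boundary circle.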

Corollary 4.11 shows that the range of influence of $(t_0, x_0)$ for the two-dimensional wave equation is the solid region bounded by the forward light cone $\Gamma_+(t_0, x_0)$, not just the surface. Thus, in $\mathbb{R}^2$ the wave caused by a sudden disturbance has a lingering “tail” after the initial wavefront has passed, as illustrated in Fig. 4.14.

Fig. 4.14 Two-dimensional waveform observed at a distance 3 from the origin, corresponding to the radial impulse shown on the left in Fig. 4.13

In all dimensions, solutions of the wave equation exhibit a phenomenon known as finite propagation speed. If the constant $c$ is reinstated as in (4.30), then the range of influence is restricted to spacetime points that are reachable at a speed less than or equal to $c$.


4.7  Energy  and  Uniqueness 


In Theorem 4.1, uniqueness of solutions was a consequence of the method of characteristics. In this section we will present an alternative approach, which allows us
to  deduce  uniqueness  directly  from  the  equation  without  requiring  any  knowledge 
of  the  solution  other  than  differentiability.  This  argument  is  based  on  the  concept  of 
energy  of  a  solution,  which  proves  to  be  a  powerful  tool  for  analyzing  many  different 
types  of  PDE. 

To motivate the definition, let us specialize again to the case of a string of length $\ell$ with fixed ends. Assume that $u \in C^2([0, \infty) \times [0, \ell])$ satisfies the string wave equation (4.4) with Dirichlet boundary conditions. In the discrete model of the string used for the derivation of the equation, the segment of length $\Delta x$ located at $x_j$ had mass $\rho \Delta x$ and velocity $\frac{\partial u}{\partial t}(t, x_j)$. By the standard expression for the kinetic energy of a moving particle, $\frac{1}{2}(\text{mass}) \times (\text{velocity})^2$, the kinetic energy of this segment is therefore

$$\frac{1}{2} \rho \Delta x \left( \frac{\partial u}{\partial t}(t, x_j) \right)^2.$$

Summing over the segments and passing to the continuum limit gives a formula for the total kinetic energy of the string:

$$\mathcal{E}_K(t) := \frac{\rho}{2} \int_0^\ell \left( \frac{\partial u}{\partial t} \right)^2 dx.$$


The potential energy of the solution can be calculated as the energy required to move the string from zero displacement into the configuration described by $u(t, \cdot)$. Let us represent this process by scaling the displacement to $s u(t, \cdot)$ for $s \in [0, 1]$. By (4.1) the opposing force generated by the tension also scales proportionally to $s$. The work required to shift the segment at $x_j$ from $s$ to $s + \Delta s$ is therefore $s \Delta F(t, x_j) u(t, x_j) \Delta s$. The potential energy associated with this segment is

$$\begin{aligned} \Delta \mathcal{E}_P(t, x_j) &:= -\int_0^1 s\, \Delta F(t, x_j)\, u(t, x_j)\, ds \\ &= -\frac{1}{2} u(t, x_j)\, \Delta F(t, x_j) \\ &= -\frac{T}{2} u(t, x_j) \frac{\partial^2 u}{\partial x^2}(t, x_j)\, \Delta x, \end{aligned}$$

with a minus sign because the displacement and force are in opposing directions.

Summing over the segments and taking the continuum limit gives the total potential energy,

$$\mathcal{E}_P(t) = -\frac{T}{2} \int_0^\ell u \frac{\partial^2 u}{\partial x^2}\, dx.$$

For comparison to the kinetic term, it is convenient to integrate by parts (the boundary terms vanish by the Dirichlet conditions) and rewrite this in the form

$$\mathcal{E}_P(t) = \frac{T}{2} \int_0^\ell \left( \frac{\partial u}{\partial x} \right)^2 dx.$$

The total energy of the one-dimensional string at time $t$ is given by

$$\mathcal{E} = \mathcal{E}_K + \mathcal{E}_P.$$


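For the higher-dimensional wave equation (4.30) on a domain $\Omega \subset \mathbb{R}^n$ the corresponding definition is

$$\mathcal{E}[u](t) := \frac{1}{2} \int_\Omega \left[ \left( \frac{\partial u}{\partial t} \right)^2 + c^2 |\nabla u|^2 \right] d^n x. \tag{4.46}$$

This is well-defined for $u \in C^2([0, \infty) \times \overline{\Omega})$, provided $\Omega$ is bounded.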

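Theorem 4.12 Suppose $\Omega \subset \mathbb{R}^n$ is a bounded domain with piecewise $C^1$ boundary. If $u \in C^2([0, \infty) \times \overline{\Omega})$ is a solution of (4.30) with $u|_{\partial\Omega} = 0$, then the energy $\mathcal{E}[u]$ defined by (4.46) is independent of $t$.

Proof The assumptions on $u$ justify differentiating under the integral, so that

$$\frac{d}{dt} \mathcal{E}[u] = \int_\Omega \left[ \frac{\partial u}{\partial t} \frac{\partial^2 u}{\partial t^2} + c^2\, \nabla \frac{\partial u}{\partial t} \cdot \nabla u \right] d^n x.$$

Under the condition $u|_{\partial\Omega} = 0$, which also forces $\partial u/\partial t$ to vanish on $\partial\Omega$, Green’s first identity (Theorem 2.10) applies to the second term to give

$$\int_\Omega \nabla \frac{\partial u}{\partial t} \cdot \nabla u\; d^n x = -\int_\Omega \frac{\partial u}{\partial t}\, \Delta u\; d^n x.$$

Thus

$$\frac{d}{dt} \mathcal{E}[u] = \int_\Omega \frac{\partial u}{\partial t} \left( \frac{\partial^2 u}{\partial t^2} - c^2 \Delta u \right) d^n x, \tag{4.47}$$

and (4.30) implies that $\mathcal{E}$ is constant. □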
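Energy conservation is simple to observe on an explicit solution. The sketch below is illustrative only: the function name, the parameter values, and the choice of solution are our assumptions. It evaluates the one-dimensional version of (4.46) for the standing wave $u(t, x) = \cos(ckt) \sin(kx)$ with $k = m\pi/\ell$, which satisfies (4.5) with Dirichlet conditions on $[0, \ell]$. A direct calculation gives the constant value $c^2 k^2 \ell/4$.

```python
import math

def energy(c, ell, mode, t, n=2000):
    """Midpoint-rule value of (4.46) in one dimension for the standing wave
    u(t, x) = cos(c k t) sin(k x), where k = mode * pi / ell."""
    k = mode * math.pi / ell
    dx = ell / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        # Exact partial derivatives of the standing wave.
        u_t = -c * k * math.sin(c * k * t) * math.sin(k * x)
        u_x = k * math.cos(c * k * t) * math.cos(k * x)
        total += 0.5 * (u_t ** 2 + c ** 2 * u_x ** 2) * dx
    return total

C, ELL, MODE = 2.0, 1.5, 3   # hypothetical speed, length, and mode number
K = MODE * math.pi / ELL
times = (0.0, 0.13, 0.5, 1.7, 4.0)
values = [energy(C, ELL, MODE, t) for t in times]
for t, e in zip(times, values):
    print(f"t = {t}: E = {e:.10f}")
print(f"predicted constant: {C ** 2 * K ** 2 * ELL / 4.0:.10f}")
```

The kinetic and potential contributions individually oscillate in time; only their sum stays fixed, as Theorem 4.12 predicts.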

Corollary 4.13 Suppose $\Omega \subset \mathbb{R}^n$ is a bounded domain with piecewise $C^1$ boundary. A solution $u \in C^2(\mathbb{R}_+ \times \overline{\Omega})$ of the equation

$$\frac{\partial^2 u}{\partial t^2} - c^2 \Delta u = f, \qquad u|_{\partial\Omega} = 0, \qquad u|_{t=0} = g, \qquad \frac{\partial u}{\partial t}\Big|_{t=0} = h,$$

is uniquely determined by the functions $f$, $g$, $h$.

Proof If $u_1$ and $u_2$ are solutions of the equation with the same initial conditions, then $w := u_1 - u_2$ satisfies (4.30) with the initial conditions

$$w(0, x) = 0, \qquad \frac{\partial w}{\partial t}(0, x) = 0.$$

At time $t = 0$ this gives $\mathcal{E}[w] = 0$, and Theorem 4.12 then implies that $\mathcal{E}[w] = 0$ for all $t$. Since the terms in the integrand of $\mathcal{E}[w]$ are non-negative, they must each vanish. This shows that $w$ is constant, and hence $w = 0$ by the initial conditions. Therefore $u_1 = u_2$. □


4.8  Exercises 


4.1 Suppose $u(t, x)$ is a solution of the wave equation (4.5) for $x \in \mathbb{R}$. Let $V$ be a parallelogram in the $(t, x)$ plane whose sides are characteristic lines. Show that the value of $u$ at each vertex of $V$ is determined by the values at the other three vertices.


4.2 The wave equation (4.5) is an appropriate model for the longitudinal vibrations of a spring. In this application $u(t, x)$ represents displacement parallel to the spring. Suppose that the spring has length $\ell$ and is free at the ends. This corresponds to the Neumann boundary conditions

$$\frac{\partial u}{\partial x}(t, 0) = \frac{\partial u}{\partial x}(t, \ell) = 0, \quad \text{for all } t \ge 0.$$

Assume the initial conditions are $g$ and $h$ as in (4.16), which also satisfy Neumann boundary conditions on $[0, \ell]$. Determine the appropriate extensions of $g$ and $h$ from $[0, \ell]$ to $\mathbb{R}$ so that the solution $u(t, x)$ given by (4.8) will satisfy the Neumann boundary problem for all $t$.


4.3 In the derivation in Sect. 4.1, suppose we include the effect of gravity by adding a term $-\rho g \Delta x$ to the discrete equation of motion (4.2), where $g > 0$ is the constant of gravitational acceleration. The wave equation is then modified to

$$\frac{\partial^2 u}{\partial t^2} - c^2 \frac{\partial^2 u}{\partial x^2} = -g. \tag{4.48}$$

Assume that $x \in [0, \ell]$, with $u$ satisfying Dirichlet boundary conditions at the endpoints.

(a) Find an equilibrium solution $u_0(x)$ for (4.48), that satisfies the boundary conditions but does not depend on time.

(b) Show that if $u_1$ is a solution of the original wave equation (4.5), also with Dirichlet boundary conditions, then $u = u_0 + u_1$ solves (4.48).

(c) Given the initial conditions $u(0, x) = 0$, $\frac{\partial u}{\partial t}(0, x) = 0$, find the corresponding initial conditions for $u_1$. Then apply Theorem 4.5 to find $u_1$ and hence solve for $u$.

4.4 In Example 4.8, let the forcing term be

$$f(t, x) = \cos(\omega t) \sin(\omega_k x),$$

with $\omega > 0$ and

$$\omega_k := \frac{k\pi}{\ell}.$$

Find the solution $u(t, x)$ given initial conditions $g = h = 0$. Include both cases $\omega \ne \omega_k$ and $\omega = \omega_k$.


4.5 The telegraph equation is a variant of the wave equation that describes the propagation of electrical signals in a one-dimensional cable:

$$\frac{\partial^2 u}{\partial t^2} + a \frac{\partial u}{\partial t} + bu - c^2 \frac{\partial^2 u}{\partial x^2} = 0,$$

where $u(t, x)$ is the line voltage, $c$ is the propagation speed, and $a, b > 0$ are determined by electrical properties of the cable (resistance, inductance, etc.). Show that the substitution

$$u(t, x) = e^{-at/2} w(t, x)$$

reduces the telegraph equation to an ordinary wave equation for $w$, provided $a$ and $b$ satisfy a certain condition. Find the general solution in this case. (This result has important practical applications, in that the electrical properties of long cables can be “tuned” to eliminate distortion.)


4.6 An alternative approach to the one-dimensional wave equation is to recast the PDE as a pair of ODE. Consider the wave equation with forcing term,

$$\frac{\partial^2 u}{\partial t^2} - c^2 \frac{\partial^2 u}{\partial x^2} = f.$$

(a) Define a vector-valued function $\mathbf{v} = (v_1, v_2)$ with components

$$v_1 := \frac{\partial u}{\partial t}, \qquad v_2 := \frac{\partial u}{\partial x}.$$

Show that $\mathbf{v}$ satisfies a vector equation

$$\frac{\partial \mathbf{v}}{\partial t} - A \cdot \frac{\partial \mathbf{v}}{\partial x} = \mathbf{b}, \tag{4.49}$$

where $\mathbf{b} := (f, 0)$ and $A$ is the matrix

$$A := \begin{pmatrix} 0 & c^2 \\ 1 & 0 \end{pmatrix}.$$

(b) The vector equation (4.49) can be solved by diagonalizing $A$. Check that if we set

$$T := \begin{pmatrix} 1 & c \\ 1 & -c \end{pmatrix},$$

then

$$T A T^{-1} = \begin{pmatrix} c & 0 \\ 0 & -c \end{pmatrix}.$$

Then show that under the substitution

$$\mathbf{w} := T\mathbf{v},$$

(4.49) reduces to a pair of linear conservation equations for the components of $\mathbf{w}$:

$$\begin{aligned} \frac{\partial w_1}{\partial t} - c \frac{\partial w_1}{\partial x} &= f, \\ \frac{\partial w_2}{\partial t} + c \frac{\partial w_2}{\partial x} &= f. \end{aligned} \tag{4.50}$$

(c) Translate the initial conditions

$$u(0, x) = g(x), \qquad \frac{\partial u}{\partial t}(0, x) = h(x),$$

into initial conditions for $w_1$ and $w_2$, and then solve (4.50) using the method of characteristics.

(d) Combine the solutions for $w_1$ and $w_2$ to compute $v_1 = \partial u/\partial t$, and then integrate to solve for $u$. Your answer should be a combination of the d’Alembert formula (4.8) and the Duhamel formula (4.22).


4.7 The evolution of a quantum-mechanical wave function $u(t, x)$ is governed by the Schrödinger equation:

$$\frac{\partial u}{\partial t} - i \Delta u = 0 \tag{4.51}$$

(ignoring the physical constants). Suppose that $u(t, x)$ is a solution of (4.51) for $t \in [0, \infty)$ and $x \in \mathbb{R}^n$, with initial condition

$$u(0, x) = g(x).$$

Assume that

$$\int_{\mathbb{R}^n} |g|^2\, d^n x < \infty.$$

(a) Show that for all $t \ge 0$,

$$\int_{\mathbb{R}^n} |u(t, x)|^2\, d^n x = \int_{\mathbb{R}^n} |g|^2\, d^n x.$$

(In quantum mechanics $|u|^2$ is interpreted as a probability density, so this identity is conservation of total probability.)

(b) Show that a solution of Schrödinger’s equation is uniquely determined by the initial condition $g$.


4.8 In $\mathbb{R}^n$ consider the wave equation

$$\frac{\partial^2 u}{\partial t^2} - c^2 \Delta u = 0. \tag{4.52}$$

The plane wave solutions have the form

$$u(t,x) = e^{i(k \cdot x - \omega t)}, \tag{4.53}$$

where $\omega \in \mathbb{R}$ and $k \in \mathbb{R}^n$ are constants.

(a) Find the condition on $\omega = \omega(k)$ for which $u$ solves (4.52).

(b) For fixed $t, \theta \in \mathbb{R}$, show that $\{x \in \mathbb{R}^n;\ u(t,x) = e^{i\theta}\}$ is a set of planes perpendicular to $k$. Show that these planes propagate, as $t$ increases, in a direction parallel to $k$ with speed given by $c$. (Hence the term “plane” wave.)


4.9 The Klein-Gordon equation in $\mathbb{R}^n$ is a variant of the wave equation that appears in relativistic quantum mechanics,

$$\frac{\partial^2 u}{\partial t^2} - \Delta u + m^2 u = 0, \tag{4.54}$$

where $m$ is the mass of a particle.

(a) Find a formula for $\omega = \omega(k, m)$ under which this equation admits plane wave solutions of the form (4.53).

(b) Show that we can define a conserved energy $\mathcal{E}$ for this equation by adding a term proportional to $u^2$ to the integrand in (4.46).


Chapter  5 

Separation  of  Variables 


Some PDE can be split into pieces that involve distinct variables. For example, the equation

$$\frac{\partial u}{\partial t} - a(t)\,b(x)\,\Delta u = 0$$

could be written as

$$\frac{1}{a(t)} \frac{\partial u}{\partial t} = b(x)\,\Delta u,$$

provided $a(t) \neq 0$. This puts all of the $t$ derivatives and $t$-dependent coefficients on the left and all of the terms involving $x$ on the right.

Splitting an equation this way is called separation of variables. For PDE that admit separation, it is natural to look for product solutions whose factors depend on the separate variables, e.g., $u(t,x) = v(t)\phi(x)$. The full PDE then reduces to a pair of equations for the factors. In some cases, one or both of the reduced equations is an ODE that can be solved explicitly.

This  idea  is  most  commonly  applied  to  evolution  equations  such  as  the  heat  or 
wave  equations.  The  classical  versions  of  these  PDE  have  constant  coefficients,  and 
separation  of  variables  can  thus  be  used  to  split  the  time  variable  from  the  spatial 
variables.  This  reduces  the  evolution  equation  to  a  simple  temporal  ODE  and  a  spatial 
PDE  problem. 

Separation among the spatial variables is sometimes possible as well, but this requires symmetry in the equation that is also shared by the domain. For example, we can separate variables for the Laplacian on rectangular or circular domains in $\mathbb{R}^2$. But if the domain is irregular or the differential operator has variable coefficients, then separation is generally not possible.

Despite these limitations, separation of variables plays a significant role in the development of PDE theory. Explicit solutions can still yield valuable information even if they are very special cases.


© Springer International Publishing AG 2016
D. Borthwick, Introduction to Partial Differential Equations,
Universitext, DOI 10.1007/978-3-319-48936-0_5




Fig.  5.1  Frequency  decomposition  for  the  sound  of  a  violin  string 


5.1  Model  Problem:  Overtones 

In 1636 the mathematician Marin Mersenne published his observation that a vibrating string produces multiple pitches simultaneously. The most audible pitch corresponds to the lowest frequency of vibration, called the fundamental tone of the string. Mersenne also detected higher pitches, at integer multiples of the fundamental frequency. (The relationship between frequency and pitch is logarithmic; doubling the frequency raises the pitch by one octave.)

The higher multiples of the fundamental frequency are called overtones of the string. Figure 5.1 shows the frequency decomposition for a sound sample of a bowed violin string, with a fundamental frequency of 440 Hz. The overtones appear as peaks in the intensity plot at multiples of 440.

At the time of Mersenne’s observations, there was no theoretical model for string vibration that would explain the overtones. The wave equation that d’Alembert subsequently developed (a century later) gave the first theoretical justification. However, this connection is not apparent in the explicit solution formula developed in Sect. 4.3. To understand how the overtones are predicted by the wave equation, we need to organize the solutions in terms of frequency.


5.2  Helmholtz  Equation 

The classical evolution equations on $\mathbb{R}^n$ have the form

$$P_t u - \Delta u = 0, \tag{5.1}$$

where $P_t$ is a first- or second-order differential operator involving only the time variable. Examples include the wave equation ($P_t = \partial^2/\partial t^2$), heat equation ($P_t = \partial/\partial t$), and Schrödinger equation ($P_t = -i\,\partial/\partial t$).


Lemma 5.1 If $u$ is a classical solution of (5.1) of the form

$$u(t,x) = v(t)\phi(x),$$

for $t \in \mathbb{R}$ and $x \in \Omega \subset \mathbb{R}^n$, then in any region where $u$ is nonzero there is a constant $\kappa$ such that the components solve the equations

$$P_t v = \kappa v, \qquad \Delta\phi = \kappa\phi. \tag{5.2}$$

Proof Substituting $u = v\phi$ into (5.1) gives

$$\phi\, P_t v - v\, \Delta\phi = 0.$$

Assuming that $u$ is nonzero, we can divide by $u$ to obtain

$$\frac{1}{v} P_t v = \frac{1}{\phi} \Delta\phi.$$

The left-hand side is independent of $x$ and the right is independent of $t$. We conclude that both sides must be equal to some constant $\kappa$. □

The two differential equations in (5.2) are analogous to eigenvalue equations from linear algebra, with the role of the linear operator or matrix taken by the differential operators $P_t$ or $\Delta$.

Let us first focus on the spatial problem, which is usually written in the form

$$-\Delta\phi = \lambda\phi. \tag{5.3}$$

This is called the Helmholtz equation, after the 19th century physicist Hermann von Helmholtz. The minus sign is included so that $\lambda \geq 0$ for the most common types of boundary conditions. Adapting the linear algebra terminology, we refer to the number $\lambda$ in (5.3) as an eigenvalue and the corresponding solution $\phi$ as an eigenfunction. The Helmholtz equation is sometimes called the Laplacian eigenvalue equation.

We will present a general analysis of the Helmholtz problem on any bounded domain in $\mathbb{R}^n$ in Chap. 11, and later in this chapter we will consider some two- or three-dimensional cases for which further spatial separation is possible. For the remainder of this section we restrict our attention to problems in one spatial dimension, for which (5.3) is an ODE.

Theorem 5.2 For $\phi \in C^2[0,\ell]$ the equation

$$-\frac{d^2\phi}{dx^2} = \lambda\phi, \qquad \phi(0) = \phi(\ell) = 0, \tag{5.4}$$

has nonzero solutions only if

$$\lambda = \lambda_n := \frac{\pi^2 n^2}{\ell^2}$$

for $n \in \mathbb{N}$. Up to a constant multiple, the corresponding solutions are

$$\phi_n(x) := \sin\left(\sqrt{\lambda_n}\, x\right). \tag{5.5}$$

Proof Note that (5.4) implies

$$\lambda \int_0^\ell |\phi|^2 \, dx = -\int_0^\ell \frac{d^2\phi}{dx^2}\, \overline{\phi} \, dx.$$

Using the Dirichlet boundary conditions we can integrate by parts on the right without any boundary term, yielding

$$\lambda \int_0^\ell |\phi|^2 \, dx = \int_0^\ell \left|\frac{d\phi}{dx}\right|^2 dx. \tag{5.6}$$

Assuming that $\phi$ is not identically zero, this shows that $\lambda \geq 0$. Furthermore, $\lambda = 0$ implies $d\phi/dx = 0$, which gives a constant solution. The only constant solution is the trivial case $\phi = 0$, because of the boundary conditions.

It therefore suffices to consider the case $\lambda > 0$, for which the ODE in (5.4) reduces to the harmonic oscillator equation, with independent solutions given by $\sin(\sqrt{\lambda}\,x)$ and $\cos(\sqrt{\lambda}\,x)$. Only sine satisfies the condition $\phi(0) = 0$, so the possible solutions have the form

$$\phi(x) = \sin\left(\sqrt{\lambda}\, x\right).$$

To satisfy the condition $\phi(\ell) = 0$ we must have

$$\sin\left(\sqrt{\lambda}\, \ell\right) = 0.$$

For a nonzero solution this imposes the restriction that $\sqrt{\lambda}\,\ell \in \pi\mathbb{N}$, which gives the claimed set of solutions. □
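As an informal sanity check on Theorem 5.2 (an illustrative sketch only, not part of the proof), the eigenpairs $(\lambda_n, \phi_n)$ can be tested numerically. The function names `dirichlet_eigenpair` and `helmholtz_residual`, the interval length 2.0, and the step size are all assumptions made for this example; the residual uses a central-difference approximation of $-\phi'' - \lambda\phi$.

```python
import math

def dirichlet_eigenpair(n, length):
    """Return (lambda_n, phi_n) from Theorem 5.2 on the interval [0, length]."""
    lam = (math.pi * n / length) ** 2
    def phi(x):
        return math.sin(math.sqrt(lam) * x)
    return lam, phi

def helmholtz_residual(phi, lam, x, h=1e-4):
    """Central-difference estimate of -phi''(x) - lam * phi(x)."""
    second = (phi(x + h) - 2.0 * phi(x) + phi(x - h)) / h**2
    return -second - lam * phi(x)

length = 2.0  # hypothetical string length, chosen only for illustration
for n in (1, 2, 3):
    lam, phi = dirichlet_eigenpair(n, length)
    # Largest residual over a few interior sample points.
    worst = max(abs(helmholtz_residual(phi, lam, 0.1 * j)) for j in range(1, 20))
    print(n, lam, abs(phi(0.0)), abs(phi(length)), worst)
```

The printed boundary values and residuals should all be close to zero, up to finite-difference and rounding error.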

Some of the eigenfunctions obtained in Theorem 5.2 are illustrated in Fig. 5.2. For the sake of application to our original string model, let us reinstate the propagation speed $c := \sqrt{T/\rho}$ and write the string equation as

$$\frac{\partial^2 u}{\partial t^2} - c^2 \Delta u = 0, \qquad u(t,0) = u(t,\ell) = 0. \tag{5.7}$$

Fig. 5.2 The first four eigenfunctions for a vibrating string with fixed ends

With the spatial solution given by the eigenfunction associated to $\lambda_n$, the corresponding temporal eigenvalue equation is also a harmonic oscillator ODE,

$$-\frac{d^2 v}{dt^2} = c^2 \lambda_n v.$$

The solutions could be written in terms of sines and cosines, but for the temporal component it is usually more convenient to use the complex exponential form. The general complex-valued solution is

$$v_n(t) = a_n e^{i\omega_n t} + b_n e^{-i\omega_n t}$$

with $a_n, b_n \in \mathbb{C}$ and

$$\omega_n := c\sqrt{\lambda_n} = \frac{c\pi n}{\ell}. \tag{5.8}$$

For real-valued solutions, the coefficients are restricted by $b_n = \overline{a_n}$.

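Combining the temporal and spatial components gives a set of solutions for the vibrating string problem:

$$u_n(t,x) = \left[a_n e^{i\omega_n t} + b_n e^{-i\omega_n t}\right] \sin\left(\sqrt{\lambda_n}\, x\right), \tag{5.9}$$

for $n \in \mathbb{N}$.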

The functions (5.9) are referred to as “pure-tone” solutions, because they model oscillation at a single frequency $\omega_n$. In the case of visible light waves, the frequency corresponds directly to color. For this reason the set of frequencies $\{\omega_n\}$ is called the spectrum. By association, the term spectrum is also used for sets of eigenvalues appearing in more general problems. For example, the set $\{\lambda_n\}$ of eigenvalues for which the Helmholtz problem has a nontrivial solution is called the spectrum of the Laplacian, even though $\lambda_n$ is proportional to the square of the frequency $\omega_n$.

From (5.8) we can deduce the fundamental tone of the string, as predicted by d’Alembert’s wave equation model. To convert frequency to the standard unit of Hz (cycles per second), we divide $\omega_1$ by $2\pi$ to obtain the formula

$$\frac{\omega_1}{2\pi} = \frac{1}{2\ell}\sqrt{\frac{T}{\rho}}. \tag{5.10}$$

This is known as Mersenne’s law, published in 1637.

The wave equation model also predicts the higher frequencies $\omega_n = n\omega_1$, corresponding to the sequence of overtones illustrated in Fig. 5.1. The fact that each overtone is associated with a particular spatial eigenfunction is significant. The waveforms for higher overtones have nodes, meaning points where the string is stationary. As we can see in Fig. 5.2, the nodes associated to the frequency $\omega_n$ subdivide the string into $n$ equal segments. Touching the string lightly at one of these nodes will knock out the lower frequencies, a practice string players refer to as playing a “harmonic”.
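To make Mersenne’s law and the overtone sequence concrete, here is a small sketch that evaluates $\omega_n/(2\pi) = n c/(2\ell)$ with $c = \sqrt{T/\rho}$. The function name `string_frequencies_hz` and the numerical values of length, density, and tension are hypothetical, chosen only so that the fundamental comes out at 440 Hz; they are not taken from the text.

```python
import math

def string_frequencies_hz(tension, density, length, count):
    """Frequencies omega_n / (2 pi) = n c / (2 l) in Hz, with c = sqrt(T / rho)."""
    c = math.sqrt(tension / density)
    return [n * c / (2.0 * length) for n in range(1, count + 1)]

# Hypothetical string parameters (assumptions for illustration only).
length = 0.325                                   # metres
density = 6.0e-4                                 # kilograms per metre
tension = density * (2 * length * 440.0) ** 2    # newtons, tuned to give 440 Hz

for n, freq in enumerate(string_frequencies_hz(tension, density, length, 4), start=1):
    print(n, round(freq, 3))
```

The output lists the fundamental followed by overtones at integer multiples of it, matching the pattern $\omega_n = n\omega_1$.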

As this discussion illustrates, the “spectral analysis” of the wave equation is more directly connected to experimental observation than the explicit solution formula (4.8). The displacement of a vibrating string is technically difficult to observe directly because the motion is both rapid and of small amplitude. Such observations were first achieved by Hermann von Helmholtz in the mid-19th century.


Example 5.3 The one-dimensional wave equation can be used to model the fluctuations of air pressure inside a clarinet. The interior of a clarinet is essentially a cylindrical column, and for simplicity we can assume that the pressure is constant on cross-sections of the cylinder, so that the variations in pressure are described by a function $u(t,x)$ with $x \in [0,\ell]$, where $\ell$ is the length of the instrument. Pressure fluctuations are measured relative to the fixed atmospheric background, with $u = 0$ for atmospheric pressure.

The maximum pressure fluctuation occurs at the mouthpiece at $x = 0$, where a reed vibrates as the player blows air into the instrument. Since a local maximum of the pressure corresponds to a critical point of $u(t,\cdot)$, the appropriate boundary condition is

$$\frac{\partial u}{\partial x}(t,0) = 0. \tag{5.11}$$

At the opposite end the air column is open to the atmosphere, so the pressure does not fluctuate,

$$u(t,\ell) = 0. \tag{5.12}$$


The evolution of $u$ as a function of $t$ is governed by the wave equation (4.5), with $c$ equal to the speed of sound. The corresponding Helmholtz problem is

$$-\frac{d^2\phi}{dx^2} = \lambda\phi, \qquad \phi'(0) = 0, \quad \phi(\ell) = 0. \tag{5.13}$$

The boundary condition at $x = 0$ implies that

$$\phi(x) = \cos\left(\sqrt{\lambda}\, x\right),$$

and the condition at $x = \ell$ then requires

$$\cos\left(\sqrt{\lambda}\, \ell\right) = 0.$$

This means that the eigenvalues are given by

$$\lambda_n = \frac{\pi^2}{\ell^2}\left(n - \frac{1}{2}\right)^2,$$

for $n \in \mathbb{N}$. Some of the resulting eigenfunctions are shown in Fig. 5.3.

Fig. 5.3 The first four eigenfunctions for pressure fluctuations in a clarinet

Fig. 5.4 Clarinet frequency decomposition


The corresponding oscillation frequencies are given by

$$\omega_n = \frac{c\pi}{\ell}\left(n - \frac{1}{2}\right).$$

In contrast to the string, the model predicts that the clarinet’s spectrum will contain only odd multiples of the fundamental frequency $\omega_1$. Figure 5.4 shows the frequency decomposition for a clarinet sound sample. The prediction holds true for the first few modes, but the simple model appears to break down at higher frequencies. ◊
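The odd-multiple pattern of Example 5.3 can be checked directly from $\omega_n = (c\pi/\ell)(n - \tfrac12)$, which in Hz reads $(n - \tfrac12)\,c/(2\ell)$. The following is a minimal sketch; the function name `clarinet_frequencies_hz`, the sound speed of 343 m/s, and the tube length of 0.6 m are assumptions made for illustration and do not come from the text.

```python
def clarinet_frequencies_hz(c, length, count):
    """Frequencies omega_n / (2 pi) = (n - 1/2) c / (2 l) for the clarinet model."""
    return [(n - 0.5) * c / (2.0 * length) for n in range(1, count + 1)]

# Hypothetical values: speed of sound in air and an idealized tube length.
speed_of_sound = 343.0   # metres per second (assumed)
tube_length = 0.6        # metres (assumed)

freqs = clarinet_frequencies_hz(speed_of_sound, tube_length, 4)
ratios = [f / freqs[0] for f in freqs]
print([round(f, 2) for f in freqs])
print([round(r, 6) for r in ratios])
```

The ratios to the fundamental come out as 1, 3, 5, 7, in contrast to the ratios 1, 2, 3, 4 for the string.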


5.3  Circular  Symmetry 

In dimension greater than one, spatial separation of variables is essentially the only way to compute explicit solutions of the Helmholtz equation (5.3), and this only works for very special cases. The most straightforward example is a rectangular domain in $\mathbb{R}^n$, which we will discuss in the exercises.

In this section we consider the simplest non-rectangular case, based on polar coordinates $(r,\theta)$ in $\mathbb{R}^2$. Separation in polar coordinates allows us to compute eigenfunctions and eigenvalues on a disk in $\mathbb{R}^2$, for example.

With $x = (x_1, x_2)$ in $\mathbb{R}^2$, polar coordinates are defined by

$$(x_1, x_2) = (r\cos\theta,\ r\sin\theta).$$

The polar form of the Laplacian is computed by writing

$$\Delta = \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2}$$

and then converting the partials with respect to $x_1$ and $x_2$ into $r$ and $\theta$ derivatives using the chain rule. The result is

$$\Delta = \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2}. \tag{5.14}$$

Note that there are no mixed partials involving both $r$ and $\theta$, and that the coefficients do not depend on $\theta$. This allows separation of $r$ and $\theta$, provided the domain is defined by specifying ranges of $r$ and $\theta$.
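The polar formula (5.14) can be checked numerically against the Cartesian definition of the Laplacian. The sketch below is only an illustration under stated assumptions: the helper names, the finite-difference step, and the test function $u(x_1,x_2) = e^{x_1}\sin(2x_2) + x_1^2 x_2$ (whose Laplacian is $-3e^{x_1}\sin(2x_2) + 2x_2$) are all choices made here, not part of the text.

```python
import math

def laplacian_cartesian(u, x1, x2, h=1e-4):
    """Five-point finite-difference approximation of u_{x1 x1} + u_{x2 x2}."""
    return (u(x1 + h, x2) + u(x1 - h, x2) + u(x1, x2 + h) + u(x1, x2 - h)
            - 4.0 * u(x1, x2)) / h**2

def laplacian_polar(u, r, theta, h=1e-4):
    """Finite-difference version of (5.14): u_rr + u_r / r + u_theta_theta / r^2."""
    def U(rr, tt):
        return u(rr * math.cos(tt), rr * math.sin(tt))
    u_rr = (U(r + h, theta) - 2.0 * U(r, theta) + U(r - h, theta)) / h**2
    u_r = (U(r + h, theta) - U(r - h, theta)) / (2.0 * h)
    u_tt = (U(r, theta + h) - 2.0 * U(r, theta) + U(r, theta - h)) / h**2
    return u_rr + u_r / r + u_tt / r**2

def sample_u(x1, x2):
    """Hypothetical smooth test function."""
    return math.exp(x1) * math.sin(2.0 * x2) + x1**2 * x2

r, theta = 1.3, 0.7
x1, x2 = r * math.cos(theta), r * math.sin(theta)
exact = -3.0 * math.exp(x1) * math.sin(2.0 * x2) + 2.0 * x2
print(laplacian_cartesian(sample_u, x1, x2), laplacian_polar(sample_u, r, theta), exact)
```

The three printed numbers should agree up to finite-difference error, away from the origin where the polar coordinates degenerate.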

To solve the radial eigenvalue equation, we will use Bessel functions, named for the astronomer Friedrich Bessel. Bessel’s equation is the ODE:

$$z^2 f''(z) + z f'(z) + (z^2 - k^2) f(z) = 0, \tag{5.15}$$

with $k \in \mathbb{C}$ in general. For our application $k$ will be an integer. The standard pair of linearly independent solutions is given by the Bessel functions $J_k(z)$ and $Y_k(z)$.

The Bessel J-functions, a few of which are pictured in Fig. 5.5, satisfy

$$J_{-k}(z) = (-1)^k J_k(z), \tag{5.16}$$

for all $k \in \mathbb{Z}$. Bessel represented these solutions as integrals:

$$J_k(z) := \frac{1}{\pi}\int_0^\pi \cos(z\sin\theta - k\theta)\, d\theta.$$

One can also write $J_k$ as a power series: for $k \in \mathbb{N}_0$,

$$J_k(z) = \left(\frac{z}{2}\right)^k \sum_{l=0}^{\infty} \frac{1}{l!\,(k+l)!}\left(-\frac{z^2}{4}\right)^l. \tag{5.17}$$

Fig. 5.5 The first four Bessel J-functions

Together with (5.16), this shows that $J_k(z) \sim c_k z^{|k|}$ as $z \to 0$ for any $k \in \mathbb{Z}$. In contrast, the Bessel Y-function satisfies $Y_k(z) \sim c_k z^{-|k|}$ as $z \to 0$ for $k \neq 0$, while $Y_0(z)$ diverges logarithmically.
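The integral representation and the power series (5.17) can be compared numerically. The sketch below uses only the standard library; the function names, the number of series terms, and the midpoint quadrature rule are assumptions made for this illustration rather than anything prescribed by the text.

```python
import math

def bessel_j_series(k, z, terms=40):
    """Partial sum of the power series (5.17), valid for integers k >= 0."""
    total = 0.0
    for l in range(terms):
        total += (-z * z / 4.0) ** l / (math.factorial(l) * math.factorial(k + l))
    return (z / 2.0) ** k * total

def bessel_j_integral(k, z, steps=2000):
    """Bessel's integral formula, approximated with the midpoint rule on [0, pi]."""
    h = math.pi / steps
    total = 0.0
    for j in range(steps):
        theta = (j + 0.5) * h
        total += math.cos(z * math.sin(theta) - k * theta)
    return total * h / math.pi

for k in range(4):
    z = 1.5
    print(k, bessel_j_series(k, z), bessel_j_integral(k, z))

# Small-z behaviour: J_k(z) / z^k should approach 1 / (2^k k!).
print(bessel_j_series(2, 1e-3) / 1e-3**2, 1.0 / (2**2 * math.factorial(2)))
```

The integral version also accepts negative integer $k$, which gives a quick check of the symmetry (5.16).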

A change of sign in (5.15) gives the equation

$$z^2 f''(z) + z f'(z) - (z^2 + k^2) f(z) = 0. \tag{5.18}$$

Its standard solutions are the modified Bessel functions $I_k(z)$ and $K_k(z)$. As $z \to 0$ these satisfy the asymptotics $I_k(z) \sim c_k z^{|k|}$, as illustrated in Fig. 5.6, and $K_k(z) \sim c_k z^{-|k|}$.

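Lemma 5.4 Suppose $\phi \in C^2(\mathbb{R}^2)$ is a solution of

$$-\Delta\phi = \lambda\phi,$$

that factors as a product $h(r)w(\theta)$. Then, up to a multiplicative constant, $\phi$ has the form

$$\phi_{\lambda,k}(r,\theta) := h_k(r)\, e^{ik\theta} \tag{5.19}$$

for some $k \in \mathbb{Z}$, with

$$h_k(r) := \begin{cases} r^{|k|}, & \lambda = 0, \\ J_k(\sqrt{\lambda}\, r), & \lambda > 0, \\ I_k(\sqrt{-\lambda}\, r), & \lambda < 0. \end{cases}$$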

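Proof Under the assumption $\phi = hw$, the Helmholtz equation reduces by (5.14) to

$$w\,\frac{\partial^2 h}{\partial r^2} + \frac{w}{r}\frac{\partial h}{\partial r} + \frac{h}{r^2}\frac{\partial^2 w}{\partial\theta^2} + \lambda h w = 0.$$

Fig. 5.6 The first four modified Bessel I-functions

With some rearranging, we can separate the $r$ and $\theta$ variables,

$$\frac{1}{h}\left(r^2\frac{\partial^2 h}{\partial r^2} + r\frac{\partial h}{\partial r}\right) + \lambda r^2 = -\frac{1}{w}\frac{\partial^2 w}{\partial\theta^2}, \tag{5.20}$$

provided $h$ and $w$ are nonzero.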

As in Lemma 5.1, we conclude that both sides must be equal to some constant $\kappa$. The $\theta$ equation is

$$-\frac{\partial^2 w}{\partial\theta^2} = \kappa w. \tag{5.21}$$

The function $w(\theta)$ is assumed to be $2\pi$-periodic. By the arguments used in Theorem 5.2, a nontrivial solution is possible only if $\kappa = k^2$ where $k$ is an integer. A full set of $2\pi$-periodic solutions of (5.21) is given by

$$w_k(\theta) := e^{ik\theta}, \qquad k \in \mathbb{Z}.$$

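Before examining the radial equation, let us note that the assumption that $\phi$ is $C^2$ imposes a boundary condition at $r = 0$. To see this, first note that the function $r = \sqrt{x_1^2 + x_2^2}$ is continuous at $(0,0)$ but not differentiable. For $r > 0$,

$$\frac{\partial r}{\partial x_j} = \frac{x_j}{r},$$

which does not have a limit as $r \to 0$. On the other hand, the functions

$$r e^{\pm i\theta} = x_1 \pm i x_2$$

are $C^\infty$. Similarly, for $k \in \mathbb{Z}$ we have

$$r^{|k|} e^{ik\theta} = \begin{cases} (x_1 + i x_2)^k, & k \in \mathbb{N}_0, \\ (x_1 - i x_2)^{-k}, & -k \in \mathbb{N}. \end{cases} \tag{5.22}$$

These functions are polynomial and hence $C^\infty$. We will see below that the solutions of the radial equation corresponding to $\kappa = k^2$ satisfy $h(r) \sim a r^{\pm k}$ as $r \to 0$, for some constant $a$. The differentiability of $\phi$ at the origin will require the asymptotic condition

$$h_k(r) \sim a r^{|k|} \tag{5.23}$$

as $r \to 0$.

For $w_k(\theta) = e^{ik\theta}$, the radial component of (5.20) is

$$\frac{\partial^2 h_k}{\partial r^2} + \frac{1}{r}\frac{\partial h_k}{\partial r} + \left(\lambda - \frac{k^2}{r^2}\right) h_k = 0. \tag{5.24}$$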

The case $\lambda = 0$ is relatively straightforward to analyze. In this case (5.24) is homogeneous in the $r$ variable (meaning invariant under scaling). Such equations are solved by monomials of the form $h_k(r) = r^\alpha$ with $\alpha \in \mathbb{R}$. If we substitute this guess into (5.24) with $\lambda = 0$, the equation reduces to

$$\alpha^2 - k^2 = 0,$$

with solutions $\alpha = \pm k$. Since a second order ODE has exactly two independent solutions, the functions $r^{\pm k}$ give a full set of solutions for $k \neq 0$. For $k = 0$ the two possibilities are $1$ and $\ln r$. By the condition (5.23), the solutions $\ln r$ and $r^{-|k|}$ must be ruled out. The only possible solutions for $\lambda = 0$ are thus

$$h_k(r) = r^{|k|}.$$

Note that the resulting solutions,

$$\phi_{0,k}(r,\theta) := r^{|k|} e^{ik\theta},$$

are precisely the polynomials (5.22).
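The identity (5.22) between the polar form $r^{|k|}e^{ik\theta}$ and the polynomials in $x_1 \pm i x_2$ is easy to spot-check with complex arithmetic. This is a small illustrative sketch; the function names and the sample point are assumptions made here.

```python
import cmath
import math

def polar_form(k, r, theta):
    """The product solution r^{|k|} e^{i k theta} for lambda = 0."""
    return r ** abs(k) * cmath.exp(1j * k * theta)

def polynomial_form(k, x1, x2):
    """The polynomial from (5.22): (x1 + i x2)^k or (x1 - i x2)^{-k}."""
    if k >= 0:
        return complex(x1, x2) ** k
    return complex(x1, -x2) ** (-k)

r, theta = 0.8, 2.1   # arbitrary sample point
x1, x2 = r * math.cos(theta), r * math.sin(theta)
for k in range(-3, 4):
    print(k, abs(polar_form(k, r, theta) - polynomial_form(k, x1, x2)))
```

Each printed difference should be at the level of rounding error.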

For $\lambda > 0$, (5.24) can be reduced to the Bessel form (5.15) by the change of variables $z = \sqrt{\lambda}\, r$. The possible solutions $Y_k(\sqrt{\lambda}\, r)$ are ruled out because they diverge at $r = 0$. On the other hand, the power series (5.17) shows that the function $h_k(r) = J_k(\sqrt{\lambda}\, r)$ satisfies the condition (5.23). Thus for $\lambda > 0$ the possible eigenfunction with $k \in \mathbb{Z}$ is

$$\phi_{\lambda,k}(r,\theta) := J_k(\sqrt{\lambda}\, r)\, e^{ik\theta}.$$

We should check that this function is at least $C^2$ at the origin. In fact, it follows from the power series expansion (5.17) that $\phi_{\lambda,k}$ is $C^\infty$ on $\mathbb{R}^2$.

Similar considerations apply for $\lambda < 0$, except that this time the substitution $z = \sqrt{-\lambda}\, r$ reduces (5.24) to (5.18). The condition (5.23) is satisfied only for the solution $I_k(\sqrt{-\lambda}\, r)$. □

Example 5.5 The linear model for the vibration of a drumhead is the wave equation (4.30). For a circular drum we can take the spatial domain to be the unit disk $\mathbb{D} := \{r < 1\} \subset \mathbb{R}^2$. Lemma 5.1 reduces the problem of determining the frequencies of the drum to the Helmholtz equation,

$$-\Delta\phi = \lambda\phi, \qquad \phi|_{\partial\mathbb{D}} = 0. \tag{5.25}$$

The possible product solutions are given by Lemma 5.4, subject to the boundary condition $h_k(1) = 0$. This rules out $\lambda \leq 0$, because in that case $h_k(r)$ has no zeros for $r > 0$.

For $\lambda > 0$, we have $h_k(r) = J_k(\sqrt{\lambda}\, r)$, and the boundary condition takes the form

$$J_k(\sqrt{\lambda}) = 0.$$

Table 5.1 Zeros of the Bessel function $J_k$. For each $k$, the spacing between zeros approaches $\pi$ as $m \to \infty$

  k    j_{k,1}    j_{k,2}    j_{k,3}    j_{k,4}
  0    2.405      5.520      8.654      11.792
  1    3.832      7.016      10.174     13.324
  2    5.136      8.417      11.620     14.796
  3    6.380      9.761      13.015     16.223
  4    7.588      11.065     14.373     17.616

(This is analogous to the condition $\sin(\sqrt{\lambda}\,\ell) = 0$ from the one-dimensional string problem.) Although $J_k$ is not a periodic function, it does have an infinite sequence of positive zeros with roughly even spacing. It is customary to write these zeros in increasing order as

$$0 < j_{k,1} < j_{k,2} < \cdots.$$

By the symmetry (5.16),

$$j_{-k,m} = j_{k,m}.$$

Table 5.1 lists some of these zeros.

Restricting $\sqrt{\lambda}$ to the set of Bessel zeros gives the set of eigenvalues

$$\lambda_{k,m} = j_{k,m}^2,$$

indexed by $k \in \mathbb{Z}$, $m \in \mathbb{N}$. The corresponding eigenfunctions are

$$\phi_{k,m}(r,\theta) := J_k(j_{k,m}\, r)\, e^{ik\theta}. \tag{5.26}$$

The first few of these are illustrated in Fig. 5.7.

The collection of functions (5.26) yields a complete list of eigenfunctions and eigenvalues for $\mathbb{D}$, although that is not something we can prove here. ◊
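The zeros $j_{k,m}$ in Table 5.1 can be approximated with elementary tools. The sketch below is one possible approach under stated assumptions: it evaluates $J_k$ through Bessel's integral formula with a midpoint rule, scans for sign changes on a grid, and refines each bracket by bisection. The function names, grid step, and tolerances are choices made for this illustration.

```python
import math

def bessel_j(k, z, steps=400):
    """J_k(z) from Bessel's integral formula, using the midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for j in range(steps):
        theta = (j + 0.5) * h
        total += math.cos(z * math.sin(theta) - k * theta)
    return total * h / math.pi

def bessel_zeros(k, count, step=0.1, tol=1e-10):
    """First `count` positive zeros of J_k, by a sign-change scan plus bisection."""
    zeros = []
    a = step
    fa = bessel_j(k, a)
    while len(zeros) < count:
        b = a + step
        fb = bessel_j(k, b)
        if fa * fb < 0.0:
            lo, hi, flo = a, b, fa
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                fmid = bessel_j(k, mid)
                if flo * fmid <= 0.0:
                    hi = mid
                else:
                    lo, flo = mid, fmid
            zeros.append(0.5 * (lo + hi))
        a, fa = b, fb
    return zeros

for k in range(5):
    print(k, [round(z, 3) for z in bessel_zeros(k, 4)])
```

The scan step of 0.1 relies on the zeros being separated by roughly $\pi$, as noted in the caption of Table 5.1.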

The eigenvalues calculated in Example 5.5 correspond to vibrational frequencies

$$\omega_{k,m} := c\, j_{k,m},$$

for $k \in \mathbb{Z}$ and $m \in \mathbb{N}$. The propagation speed $c$ depends on physical properties such as tension and density. The relative size of the frequencies helps to explain the lack of definite pitch in the sound of a drum. The ratios of overtones above the fundamental $\omega_{0,1}$ are shown in Table 5.2. In contrast to the vibrating string case, where the corresponding ratios were integers $1, 2, 3, \ldots$, or the clarinet model of Example 5.3 with ratios $1, 3, 5, \ldots$, the frequencies of the drum are closely spaced with no evident pattern.
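Since $\omega_{k,m}/\omega_{0,1} = j_{k,m}/j_{0,1}$, the frequency ratios follow directly from the tabulated Bessel zeros. The short sketch below copies the values of Table 5.1 into a dictionary and sorts the ratios; the names `TABLE_5_1` and `frequency_ratios` are hypothetical. Only the first seven ratios are requested, because larger ratios could involve rows with $k > 4$ that the table does not list.

```python
# Zeros j_{k,m} copied from Table 5.1 (rows k = 0..4, columns m = 1..4).
TABLE_5_1 = {
    0: [2.405, 5.520, 8.654, 11.792],
    1: [3.832, 7.016, 10.174, 13.324],
    2: [5.136, 8.417, 11.620, 14.796],
    3: [6.380, 9.761, 13.015, 16.223],
    4: [7.588, 11.065, 14.373, 17.616],
}

def frequency_ratios(zeros, count):
    """Return the `count` smallest ratios j_{k,m} / j_{0,1} as (ratio, k, m)."""
    fundamental = zeros[0][0]
    entries = [(j / fundamental, k, m)
               for k, row in zeros.items()
               for m, j in enumerate(row, start=1)]
    return sorted(entries)[:count]

for ratio, k, m in frequency_ratios(TABLE_5_1, 7):
    print(k, m, round(ratio, 3))
```

None of the ratios above the fundamental is an integer, which is the irregular spacing described in the text.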


Fig. 5.7 Contour plots of the spatial component of the eigenfunctions of $\mathbb{D}$

Table 5.2 Frequency ratios for a circular drumhead

  k    m    ω_{k,m}/ω_{0,1}
  0    1    1
  1    1    1.593
  2    1    2.136
  0    2    2.295
  3    1    2.653
  1    2    2.917
  4    1    3.155

5.4  Spherical  Symmetry 

Another special case that allows separation of spatial variables is spherical symmetry in $\mathbb{R}^3$. Spherical coordinates $(r, \varphi, \theta)$ are defined through the relation

$$(x_1, x_2, x_3) = (r\sin\varphi\cos\theta,\ r\sin\varphi\sin\theta,\ r\cos\varphi).$$

Note that $\theta$ is the azimuthal angle here and $\varphi$ the polar angle, consistent with the notation from Sect. 5.3. (This convention is standard in mathematics; in physics the roles are often reversed.)

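As in the circular case, we can use the chain rule to translate the three-dimensional Laplacian into spherical variables:

$$\Delta = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2\sin\varphi}\frac{\partial}{\partial\varphi}\left(\sin\varphi\,\frac{\partial}{\partial\varphi}\right) + \frac{1}{r^2\sin^2\varphi}\frac{\partial^2}{\partial\theta^2}. \tag{5.27}$$

It is not immediately clear that this operator admits separation, because the coefficients depend on both $r$ and $\varphi$. Note, however, that we can factor $r^{-2}$ out of the angular derivative terms, to write (5.27) as

$$\Delta = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2}\Delta_{\mathbb{S}^2}, \tag{5.28}$$

where

$$\Delta_{\mathbb{S}^2} := \frac{1}{\sin\varphi}\frac{\partial}{\partial\varphi}\left(\sin\varphi\,\frac{\partial}{\partial\varphi}\right) + \frac{1}{\sin^2\varphi}\frac{\partial^2}{\partial\theta^2}. \tag{5.29}$$

Here $\mathbb{S}^2$ stands for the unit sphere $\{r = 1\} \subset \mathbb{R}^3$, and $\Delta_{\mathbb{S}^2}$ is called the spherical Laplacian.

The expression (5.29) may look awkward at first glance, but $\Delta_{\mathbb{S}^2}$ is a very natural operator geometrically. From the fact that $\Delta$ is invariant under rotations of $\mathbb{R}^3$ about the origin, we can deduce that $\Delta_{\mathbb{S}^2}$ is also invariant under rotations of the sphere. It is possible to show that $\Delta_{\mathbb{S}^2}$ is the only second-order operator with this property, up to a multiplicative constant. The operator $\Delta_{\mathbb{S}^2}$ is thus as symmetric as possible, and the reason that (5.29) looks so complicated is that the standard coordinate system $(\theta, \varphi)$ does not reflect the full symmetry of the sphere.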

We will discuss the radial component of (5.28) in an example below. For now let us focus on the Helmholtz problem on the sphere, which allows further separation of the $\theta$ and $\varphi$ variables.

The classical ODE that arises from separation of the angle variables is the associated Legendre equation:

$$(1 - z^2) f''(z) - 2z f'(z) + \left(\nu(\nu + 1) - \frac{\mu^2}{1 - z^2}\right) f(z) = 0, \tag{5.30}$$

with parameters $\mu, \nu \in \mathbb{C}$. A pair of linearly independent solutions is given by the Legendre functions $P_\nu^\mu(z)$ and $Q_\nu^\mu(z)$.

In the special case where $\nu$ is replaced by $l \in \mathbb{N}_0$ and $\mu$ by a number $m \in \{-l, \ldots, l\}$, respectively, the Legendre P-functions are given by a relatively simple formula:

$$P_l^m(z) = \frac{(-1)^m}{2^l\, l!}\,(1 - z^2)^{m/2}\,\frac{d^{l+m}}{dz^{l+m}}(z^2 - 1)^l. \tag{5.31}$$

Associated to this set of Legendre functions are functions of the angle variables called spherical harmonics. These are defined by

$$Y_l^m(\varphi, \theta) := c_{l,m}\, e^{im\theta} P_l^m(\cos\varphi), \tag{5.32}$$

where $c_{l,m}$ is a normalization constant whose value is not important for us.

From (5.31), using $z = \cos\varphi$ and $1 - z^2 = \sin^2\varphi$, we can see that $Y_l^m$ is a polynomial of degree $l$ in $\sin\varphi$ and $\cos\varphi$. This makes it relatively straightforward to check that each $Y_l^m(\varphi, \theta)$ is a smooth function on $\mathbb{S}^2$.

Lemma 5.6 Suppose $u \in C^2(\mathbb{S}^2)$ is a solution of the equation

$$-\Delta_{\mathbb{S}^2} u = \lambda u \tag{5.33}$$

that factors as $u(\varphi, \theta) = v(\varphi) w(\theta)$. Then up to a multiplicative constant, $u$ is equal to a spherical harmonic $Y_l^m$ for $l \in \mathbb{N}_0$ and $m \in \{-l, \ldots, l\}$. The corresponding eigenvalues depend only on $l$,

$$\lambda_l = l(l + 1),$$

and each has multiplicity $2l + 1$.

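Proof By (5.29), the substitution $u = vw$ leads to the separated equation

$$\frac{\sin\varphi}{v}\frac{d}{d\varphi}\left(\sin\varphi\,\frac{dv}{d\varphi}\right) + \lambda\sin^2\varphi = -\frac{1}{w}\frac{d^2 w}{d\theta^2}.$$

The continuity of $u$ requires that $w$ be $2\pi$-periodic. Hence, for the $\theta$ equation

$$-\frac{d^2 w}{d\theta^2} = \kappa w,$$

the full set of solutions is represented by $w(\theta) = e^{im\theta}$ with $\kappa = m^2$ for $m \in \mathbb{Z}$. With $u(\varphi, \theta) = v_m(\varphi) e^{im\theta}$, the eigenvalue equation (5.33) reduces to

$$\frac{1}{\sin\varphi}\frac{d}{d\varphi}\left(\sin\varphi\,\frac{dv_m}{d\varphi}\right) + \left(\lambda - \frac{m^2}{\sin^2\varphi}\right) v_m = 0.$$

Under the substitutions $z = \cos\varphi$ and $v_m(\varphi) = f(\cos\varphi)$, this becomes

$$(1 - z^2) f'' - 2z f' + \left(\lambda - \frac{m^2}{1 - z^2}\right) f = 0,$$

which is recognizable as the Legendre equation (5.30) with parameters $\mu = m$ and $\lambda = \nu(\nu + 1)$.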

Although $\mathbb{S}^2$ does not have a boundary, use of the coordinate $\varphi \in [0, \pi]$ creates artificial boundaries at the endpoints, i.e., at the poles of the sphere. This is analogous to the boundary at $r = 0$ in Lemma 5.4. We need to find solutions which will be smooth at the poles.

It turns out that for $m \in \mathbb{Z}$, the function $Q_\nu^m(z)$ diverges as $z \to 1$ for any $\nu \in \mathbb{C}$. Similarly, the functions $P_\nu^m(z)$ diverge as $z \to -1$ except for the special cases $P_l^m(z)$ given by (5.31). In other words, up to a multiplicative constant $v_m(\varphi)$ must be equal to $P_l^m(\cos\varphi)$ for some $l \in \mathbb{N}_0$ with $l \geq |m|$. The corresponding solution $u$ is proportional to the spherical harmonic $Y_l^m$.

By the identification $\nu = l$, the eigenvalue is given by $\lambda = l(l + 1)$. The corresponding multiplicity is the number of possible choices of $m \in \{-l, \ldots, l\}$, namely $2l + 1$. □
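As an informal check of Lemma 5.6, one can evaluate $P_l^m$ from the derivative formula (5.31) and test numerically that $v(\varphi) = P_l^m(\cos\varphi)$ satisfies the reduced equation with $\lambda = l(l+1)$. The sketch below is illustrative only: the function names are hypothetical, the polynomial $(z^2-1)^l$ is differentiated through its coefficient list, and the operator $\frac{1}{\sin\varphi}\frac{d}{d\varphi}(\sin\varphi\, v')$ is expanded as $v'' + \cot\varphi\, v'$ and approximated by central differences.

```python
import math

def legendre_p(l, m, z):
    """Evaluate P_l^m(z) from the derivative formula (5.31), for |z| < 1."""
    # Coefficients of (z^2 - 1)^l in increasing powers of z (binomial expansion).
    coeffs = [0.0] * (2 * l + 1)
    for j in range(l + 1):
        coeffs[2 * j] = math.comb(l, j) * (-1.0) ** (l - j)
    # Differentiate l + m times, term by term.
    for _ in range(l + m):
        coeffs = [i * c for i, c in enumerate(coeffs)][1:]
    poly = sum(c * z**i for i, c in enumerate(coeffs))
    prefactor = (-1.0) ** m / (2.0**l * math.factorial(l))
    return prefactor * (1.0 - z * z) ** (m / 2.0) * poly

def sphere_residual(l, m, phi, h=1e-4):
    """Finite-difference residual of v'' + cot(phi) v' + (l(l+1) - m^2/sin^2(phi)) v."""
    def v(p):
        return legendre_p(l, m, math.cos(p))
    v2 = (v(phi + h) - 2.0 * v(phi) + v(phi - h)) / h**2
    v1 = (v(phi + h) - v(phi - h)) / (2.0 * h)
    s = math.sin(phi)
    return v2 + math.cos(phi) / s * v1 + (l * (l + 1) - m * m / s**2) * v(phi)

for l in range(4):
    # Each l contributes 2l + 1 values of m, matching the stated multiplicity.
    residuals = [abs(sphere_residual(l, m, 1.1)) for m in range(-l, l + 1)]
    print(l, len(residuals), max(residuals))
```

The residuals should be small at any sample angle strictly between the poles; the value 1.1 used here is arbitrary.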

The spherical harmonics appearing in Lemma 5.6 give a complete set of eigenfunctions for $\Delta_{\mathbb{S}^2}$, in the sense that the only possible eigenvalues are $l(l + 1)$ for $l \in \mathbb{N}_0$ and an eigenfunction with eigenvalue $l(l + 1)$ is a linear combination of the $Y_l^m$ for $m \in \{-l, \ldots, l\}$. To prove this requires more advanced methods than we have available here.

Example 5.7 In 1925, Erwin Schrödinger developed a quantum model for the hydrogen atom in which the electron energy levels are given by the eigenvalues of the equation

$$\left(-\Delta - \frac{1}{r}\right)\phi = \lambda\phi \tag{5.34}$$

on $\mathbb{R}^3$. (We have omitted the physical constants.) The eigenfunctions $\phi$ are assumed to be bounded near $r = 0$ and decaying to zero as $r \to \infty$.

Since the term $1/r$ is radial, separation of the radial and angular variables is possible in (5.34). By Lemma 5.6, the angular components are given by spherical harmonics. A corresponding full solution has the form

$$\phi(r, \varphi, \theta) = h(r)\, Y_l^m(\varphi, \theta). \tag{5.35}$$

Substituting this into (5.34) and using the spherical form of the Laplacian (5.28) gives the radial equation

$$\left[-\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d}{dr}\right) + \frac{l(l+1)}{r^2} - \frac{1}{r}\right] h(r) = \lambda h(r). \tag{5.36}$$

One strategy used to analyze an ODE such as (5.36) is to first consider the asymptotic behavior of solutions as $r \to 0$ or $\infty$.

Suppose we assume $h(r) \sim r^\alpha$ as $r \to 0$. Plugging this into (5.36) and comparing the two sides gives a leading term

$$-\alpha(\alpha + 1)\, r^{\alpha - 2} + l(l + 1)\, r^{\alpha - 2}$$

on the left side, with all other terms of order $r^{\alpha - 1}$ or smaller. This shows that $h(r) \sim r^\alpha$ as $r \to 0$ is possible only if

$$\alpha(\alpha + 1) = l(l + 1).$$

The two solutions are $\alpha = l$ or $\alpha = -l - 1$. Taking $\alpha < 0$ would cause $h(r)$ to diverge as $r \to 0$. Therefore to obtain a solution bounded at the origin, we will assume that

$$h(r) \sim r^l$$

as $r \to 0$.

As  r  — >  oo,  if  we  consider  the  terms  in  (5.36)  with  coefficients  of  order  r°  and 
drop  the  rest,  the  equation  becomes 


—  h"(r)  -  A h(r).  (5.37) 

If $\lambda > 0$ then this shows that $h(r)$ could not possibly decay at infinity. Hence we
assume that $\lambda < 0$ and set

$$\sigma^2 := -\lambda,$$

with $\sigma \in \mathbb{R}$. The asymptotic equation (5.37) implies the behavior

$$h(r) \sim c\,e^{-\sigma r}$$

as $r \to \infty$.

Determining these asymptotics allows us to make an educated guess for the form
of the solution. For an as yet undetermined function $q(r)$, we set

$$h(r) = q(r)\,r^l e^{-\sigma r}, \tag{5.38}$$

with the conditions that $q(0) = 1$ and $q(r)$ has subexponential growth as $r \to \infty$.
The goal of setting up the solution this way is that the equation for $q(r)$ will simplify.
Substituting (5.38) into (5.36) leads to the equation

$$r q'' + 2(1 + l - \sigma r)q' + \bigl(1 - 2\sigma(l + 1)\bigr)q = 0. \tag{5.39}$$

To find solutions, we suppose $q(r)$ is given by a power series

$$q(r) = \sum_{k=0}^\infty a_k r^k,$$

with $a_0 = 1$. Plugging this into (5.39) gives

$$\sum_{k=0}^\infty \Bigl[k(k-1)a_k r^{k-1} + 2(1 + l - \sigma r)k a_k r^{k-1} + \bigl(1 - 2\sigma(l+1)\bigr)a_k r^k\Bigr] = 0.$$

Equating the coefficient of $r^k$ to zero then gives a recursive relation

$$a_{k+1} = \frac{2\sigma(k + l + 1) - 1}{(k+1)(k + 2l + 2)}\,a_k. \tag{5.40}$$

If we assume that the numerator of (5.40) never vanishes, then the recursion
relation implies that

$$a_k \sim \frac{(2\sigma)^k}{k!}$$

as $k \to \infty$. This would give $q(r) \sim c\,e^{2\sigma r}$ as $r \to \infty$, making $h(r)$ also grow
exponentially as $r \to \infty$.

The only way to avoid this exponential growth is for the sequence of $a_k$ to terminate
at some point, so that $q$ is a polynomial. The numerator on the right side of (5.40)
will eventually vanish if and only if

$$\sigma = \frac{1}{2n}$$

for some integer $n \ge l + 1$. Under this assumption the sequence $a_k$ terminates at
$k = n - l - 1$. Since $\lambda = -\sigma^2$, this restriction on $\sigma$ gives the set of eigenvalues

$$\lambda_n := -\frac{1}{4n^2}, \qquad n \in \mathbb{N}.$$
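The termination of the recursion (5.40) can be checked directly. The following is a minimal sketch in exact rational arithmetic, assuming $\sigma = 1/(2n)$ and $a_0 = 1$ as in the text; the function names `radial_coefficients` and `eigenvalue` are illustrative and do not come from the book.

```python
from fractions import Fraction

def radial_coefficients(n, l, max_terms=50):
    """Coefficients a_0, a_1, ... of q(r) from recursion (5.40) with sigma = 1/(2n)."""
    sigma = Fraction(1, 2 * n)
    coeffs = [Fraction(1)]  # normalization a_0 = 1
    for k in range(max_terms):
        numerator = 2 * sigma * (k + l + 1) - 1
        a_next = numerator / ((k + 1) * (k + 2 * l + 2)) * coeffs[-1]
        if a_next == 0:
            break  # the series has terminated, so q is a polynomial
        coeffs.append(a_next)
    return coeffs

def eigenvalue(n):
    """Eigenvalue lambda_n = -1/(4 n^2)."""
    return Fraction(-1, 4 * n * n)

for n, l in [(1, 0), (2, 0), (2, 1), (3, 0)]:
    a = radial_coefficients(n, l)
    print(n, l, eigenvalue(n), [str(c) for c in a])
```

For each admissible pair the list has $n - l$ entries, so $q$ has degree $n - l - 1$, in agreement with the termination index stated above.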

This is in fact the complete set of eigenvalues for this problem, given the conditions
we have imposed at $r = 0$ and $r \to \infty$. With this eigenvalue calculation, Schrödinger
was able to give the first theoretical explanation of the emission spectrum of hydrogen
gas (i.e., the set of wavelengths observed when the gas is excited electrically). The
origin of these emission lines had been a mystery since their discovery by Anders
Jonas Ångström in the mid-19th century.

Each value of $n$ corresponds to a family of eigenfunctions given by

$$\phi_{n,l,m}(r, \varphi, \theta) = r^l q_{n,l}(r)\,e^{-r/(2n)}\,Y_l^m(\varphi, \theta),$$

for $l \in \{0, \dots, n-1\}$, $m \in \{-l, \dots, l\}$. Here $q_{n,l}(r)$ denotes the polynomial of
degree $n - l - 1$ with coefficients specified by (5.40). To compute the multiplicity
of $\lambda_n$, we count $n$ choices for $l$ and then $2l + 1$ choices of $m$ for each $l$. The total
multiplicity is

$$\sum_{l=0}^{n-1}(2l + 1) = n^2.$$

◊

5.5 Exercises

5.1 On the half-strip $\Omega = (0, 1) \times (0, \infty) \subset \mathbb{R}^2$, find the solutions of

$$\Delta u = 0$$

that factor as a product $u(x_1, x_2) = g(x_1)h(x_2)$, under the boundary conditions

$$u(0, x_2) = u(1, x_2) = u(x_1, 0) = 0.$$


5.2 The linear model for vibrations of a rectangular drumhead is the wave equation
(4.30) with Dirichlet boundary conditions on a rectangle $\mathcal{R} := [0, \ell_1] \times [0, \ell_2] \subset \mathbb{R}^2$.
Separation of variables leads to the corresponding Helmholtz problem

$$-\Delta\phi = \lambda\phi, \qquad \phi|_{\partial\mathcal{R}} = 0.$$

Find the eigenfunctions of product type, $\phi(x_1, x_2) = \phi_1(x_1)\phi_2(x_2)$, and the associated
frequencies of vibration. For $\ell_1 = \ell_2$, compare the ratios of these frequencies
to Table 5.2. Would a square drum do a better job of producing a definite pitch?

5.3 The one-dimensional heat equation for the temperature $u(t, x)$ of a metal bar
of length $\ell$ is

$$\frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial x^2} = 0$$

for $t > 0$ and $x \in (0, \ell)$. (We will derive this in Sect. 6.1.) If the ends of the bar are
insulated, then $u$ should satisfy Neumann boundary conditions

$$\frac{\partial u}{\partial x}(t, 0) = \frac{\partial u}{\partial x}(t, \ell) = 0.$$

Find the product solutions $u(t, x) = v(t)\phi(x)$.

5.4 The damped wave equation on $\Omega \subset \mathbb{R}^n$ is

$$\frac{\partial^2 u}{\partial t^2} + \gamma\frac{\partial u}{\partial t} - \Delta u = 0, \tag{5.41}$$

where $u \in C^2([0, \infty) \times \Omega)$ and $\gamma > 0$ is a constant called the coefficient of friction.
Suppose that $\phi \in C^2(\Omega)$ satisfies the Helmholtz equation (5.3) on $\Omega$ with eigenvalue
$\lambda > 0$, for some appropriate choice of boundary conditions. Show that (5.41)
has solutions of the form

$$u(t, x) = \phi(x)e^{i\omega t},$$

and find the set of possible values of $\omega$. In particular, show that $\operatorname{Im}\omega > 0$ if $\gamma > 0$,
which implies that the solutions decay exponentially in time. Does this decay rate
depend on the oscillation frequency?

5.5 Consider this example of a nonlinear diffusion equation:

$$\frac{\partial u}{\partial t} - \Delta(u^2) = 0,$$

for $t > 0$, $x \in \mathbb{R}^n$.

(a) Assuming a product solution of the form $u(t, x) = v(t)\phi(x)$, separate variables
and find the equations for $v(t)$ and $\phi(x)$.

(b) Show that $\phi(x) = |x|^2$ solves the spatial equation, and find the corresponding
function $v(t)$ given the initial condition $v(0) = a > 0$. (Observe that the solution
“blows up” at a finite time that depends on $a$.)

5.6 In polar coordinates for $\mathbb{R}^2$, define the domain

$$\Omega = \{(r, \theta);\ 0 < r < 1,\ 0 < \theta < \pi/3\},$$

which is a sector within the unit disk. Find the eigenvalues of $\Delta$ on $\Omega$ with Dirichlet
boundary conditions.

5.7 The quantum energy levels of a harmonic oscillator in $\mathbb{R}^n$ are the eigenvalues
of the equation

$$\left(-\Delta + |x|^2\right)\phi = \lambda\phi, \tag{5.42}$$

under the condition that $\phi \in C^2(\mathbb{R}^n)$ and $\phi \to 0$ at infinity.

(a) First consider the case $n = 1$:

$$\left(-\frac{d^2}{dx^2} + x^2\right)\phi = \lambda\phi. \tag{5.43}$$

Substitute

$$\phi(x) = q(x)e^{-x^2/2}$$

into (5.43) and find the corresponding ODE for $q$.

(b) Assume that the function $q$ from (a) is given by a power series in $x$,

$$q(x) = \sum_{k=0}^\infty a_k x^k,$$

and find a recursive equation for $a_{k+2}$ in terms of $a_k$.

(c) Find the values of $\lambda$ for which the power series for $q$ from (b) truncates to a
polynomial. (The resulting functions $q$ are called Hermite polynomials.)

(d) Returning to the original problem, by reducing (5.42) to $n$ copies of the case
(5.43), deduce the set of eigenvalues $\lambda$.

5.8 Let $\mathbb{B}^3 \subset \mathbb{R}^3$ be the unit ball $\{r < 1\}$. Consider the Helmholtz problem

$$-\Delta\phi = \lambda\phi,$$

with Dirichlet boundary conditions at $r = 1$.

(a) Assume that

$$\phi(r, \varphi, \theta) = h_l(r)\,Y_l^m(\varphi, \theta),$$

where $Y_l^m$ is the spherical harmonic introduced in Sect. 5.4. Find the radial equation
for $h_l(r)$.

(b) For $l = 0$ show that the radial equation is solved by

$$h_0(r) = \frac{\sin(\sqrt{\lambda}\,r)}{r}.$$

What set of eigenvalues $\lambda$ does this give?

(c) Show that the substitution

$$h_l(r) = r^{-1/2} f_l(\sqrt{\lambda}\,r)$$

reduces the equation from (a) to a Bessel equation (5.15) for $f_l(z)$, with a fractional
value of $k$. Use this to write the solution $h_l(r)$ in terms of $\lambda$.

(d) Express the eigenvalues $\lambda$ in terms of Bessel zeros with fractional values of $k$.


Chapter  6 

The  Heat  Equation 


In physics, the term heat is used to describe the transfer of internal energy within a
system of particles. When this transfer results from collective motion of particles in
a gas or fluid, the process is called convection. The continuity equation developed in
Sect. 3.1 describes convection by fluid flow, which is the special case called advection.
Another form of heat transfer is conduction, in which the transfer is caused by random
collisions of individual particles.

The basic mathematical model for heat conduction is a PDE called the heat equation,
developed by Joseph Fourier in the early 19th century. In this chapter we will
discuss the derivation and develop some basic properties of this equation, our first
example of a PDE of parabolic type.


6.1  Model  Problem:  Heat  Flow  in  a  Metal  Rod 


A metal rod that is sufficiently thin can be treated as a one-dimensional system. Let
$u(t, x)$ denote the temperature of the rod at time $t$ and position $x$, with $x \in \mathbb{R}$ for
now.

There are two physical principles that govern the flow of heat in the rod. The
first is the relationship between thermal (internal) energy and temperature. Thermal
energy is proportional to a product of density and temperature, by a constant $c$ called
the specific heat of the material. Thus, the total thermal energy in a segment $[a, b]$
is given by

$$U(t) = c\rho\int_a^b u(t, x)\,dx. \tag{6.1}$$

We will assume that the density $\rho$ is constant, although it could be variable in some
applications.


The original version of the book was revised: Belated corrections from author have been
incorporated. The erratum to the book is available at https://doi.org/10.1007/978-3-319-48936-0_14

© Springer International Publishing AG 2016
D. Borthwick, Introduction to Partial Differential Equations,
Universitext, DOI 10.1007/978-3-319-48936-0_6

The second principle is Fourier’s law of heat conduction, which describes how heat
flows from hotter regions to colder regions. In its one-dimensional form, Fourier’s
law says that the flux of thermal energy across a given point is given by

$$q = -k\frac{\partial u}{\partial x}, \tag{6.2}$$

where the constant $k > 0$ is the thermal conductivity of the material.

Assuming that the rod is thermally isolated, conservation of energy dictates that the
rate of change of the thermal energy within the segment is equal to the flux across
its boundaries, i.e.,

$$\frac{dU}{dt}(t) = q(t, a) - q(t, b). \tag{6.3}$$

As in our derivation of the local equation for conservation of mass, the combination
of (6.1) and (6.3) yields an integral equation

$$\int_a^b\left[c\rho\frac{\partial u}{\partial t} + \frac{\partial q}{\partial x}\right]dx = 0.$$

Since $a$ and $b$ were arbitrary, this implies a local conservation law,

$$c\rho\frac{\partial u}{\partial t} + \frac{\partial q}{\partial x} = 0.$$

Using the formula for $q$ from Fourier’s law (6.2), we obtain the one-dimensional
heat equation:

$$\frac{\partial u}{\partial t} - \frac{k}{c\rho}\frac{\partial^2 u}{\partial x^2} = 0. \tag{6.4}$$


For a rod of finite length, the solution $u$ will satisfy boundary conditions that
depend on how the rod interacts with its environment. If the rod is parametrized by
$x \in [0, \ell]$ and we assume that each end is held at a fixed temperature, then this fixes
the values at the endpoints,

$$u(t, 0) = T_0, \qquad u(t, \ell) = T_1 \tag{6.5}$$

for all $t$. These are inhomogeneous Dirichlet boundary conditions. In one dimension
the inhomogeneous problem can be reduced very simply to the homogeneous case
by noting that

$$u_0(x) := T_0\left(1 - \frac{x}{\ell}\right) + T_1\frac{x}{\ell}$$

gives an equilibrium solution to the heat equation satisfying the boundary conditions
(6.5). By the superposition principle, $u - u_0$ satisfies the heat equation with
homogeneous Dirichlet conditions.

Another possible boundary assumption is that the ends are insulated, so that no
thermal energy flows in or out. This means that $q$ vanishes at the boundary, yielding
the Neumann boundary conditions

$$\frac{\partial u}{\partial x}(t, 0) = \frac{\partial u}{\partial x}(t, \ell) = 0.$$

Example 6.1 On the bounded interval $[0, \pi]$, we can find product solutions to the
heat equation using Lemma 5.1. For the Dirichlet boundary conditions $u(0) = u(\pi) = 0$,
Theorem 5.2 gives the set of sine eigenfunctions (5.5). The corresponding
heat equation solutions are

$$u(t, x) = e^{-n^2 t}\sin(nx)$$

for $n \in \mathbb{N}$. Note that all of these solutions decay exponentially to 0 as $t \to \infty$.

For insulated ends we switch to Neumann conditions and obtain the cosine modes.
The resulting set of solutions is

$$u(t, x) = e^{-n^2 t}\cos(nx)$$

for $n \in \mathbb{N}_0$. In this case the $n = 0$ mode yields a constant solution.

In Chap. 8 we will discuss the construction of series solutions from these trigonometric
families. ◊
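As a quick numerical sanity check of Example 6.1, the sketch below estimates the residual $\partial_t u - \partial_x^2 u$ of the product solutions by central differences, assuming the normalized heat equation with all physical constants equal to 1. The helper names and the step size `h` are illustrative choices, not part of the text.

```python
import math

def dirichlet_mode(n, t, x):
    """Product solution exp(-n^2 t) sin(n x), vanishing at x = 0 and x = pi."""
    return math.exp(-n * n * t) * math.sin(n * x)

def neumann_mode(n, t, x):
    """Product solution exp(-n^2 t) cos(n x), with zero x-derivative at x = 0 and x = pi."""
    return math.exp(-n * n * t) * math.cos(n * x)

def heat_residual(u, t, x, h=1e-3):
    """Central-difference estimate of u_t - u_xx at the point (t, x)."""
    u_t = (u(t + h, x) - u(t - h, x)) / (2.0 * h)
    u_xx = (u(t, x + h) - 2.0 * u(t, x) + u(t, x - h)) / (h * h)
    return u_t - u_xx

for n in (1, 2, 3):
    r_sin = heat_residual(lambda t, x: dirichlet_mode(n, t, x), 0.2, 1.0)
    r_cos = heat_residual(lambda t, x: neumann_mode(n, t, x), 0.2, 1.0)
    print(n, r_sin, r_cos)  # both residuals should be close to zero
```

The residuals are not exactly zero because of the finite-difference truncation error, which shrinks as `h` is reduced (until rounding error takes over).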

The higher dimensional form of the heat equation can be derived by an argument
similar to that given above. In $\mathbb{R}^n$, the thermal flux $q$ is vector valued, and Fourier’s
law becomes the gradient formula

$$q = -k\nabla u.$$

Local conservation of energy is expressed by the continuity equation (3.20),

$$c\rho\frac{\partial u}{\partial t} + \nabla\cdot q = 0.$$

In combination, these yield the $n$-dimensional heat equation,

$$\frac{\partial u}{\partial t} - \frac{k}{c\rho}\Delta u = 0. \tag{6.6}$$


The importance of the heat equation as a model extends well beyond its original
thermodynamic context. One of the most prominent examples of this is Albert Einstein’s
probabilistic derivation of the heat equation as a model for Brownian motion
in 1905, in one of the set of papers for which he was later awarded the Nobel prize.
Brownian motion is named for the botanist Robert Brown, who observed in 1827 that
minute particles ejected by pollen grains drifted erratically when suspended in water,
with a jittery motion for which no explanation was available at the time. Einstein
theorized that the motion was caused by collisions with a large number of molecules
whose velocities were distributed randomly. The existence of atoms and molecules
was still unconfirmed in 1905, and Einstein’s model provided crucial supporting
evidence.

To summarize Einstein’s argument, suppose that a total of $n$ particles are distributed
on the real line. In an interval of time $\tau$, the position of each particle is
assumed to change by a random amount according to a distribution function $\phi$. To
be more precise, the number of particles experiencing a displacement between $\sigma$ and
$\sigma + d\sigma$ is

$$dn = n\,\phi(\sigma)\,d\sigma.$$


The total number of particles is conserved, which imposes the condition

$$\int_{-\infty}^{\infty}\phi(\sigma)\,d\sigma = 1. \tag{6.7}$$

The distribution of displacements is assumed to be symmetric, $\phi(\sigma) = \phi(-\sigma)$,
meaning that particles are equally likely to move left or right.

Suppose the distribution of particles at time $t$ is given by a density function $\rho(t, x)$.
Under the displacement hypothesis, the values of this density at times $t$ and $t + \tau$
are related by

$$\rho(t + \tau, x) = \int_{-\infty}^{\infty}\rho(t, x - \sigma)\,\phi(\sigma)\,d\sigma. \tag{6.8}$$

To find an equation for $\rho$, Einstein takes the Taylor expansions of the density on both
sides of (6.8), obtaining

$$\rho(t + \tau, x) = \rho(t, x) + \frac{\partial\rho}{\partial t}(t, x)\,\tau + \dots \tag{6.9}$$

on the left, and

$$\rho(t, x - \sigma) = \rho(t, x) - \frac{\partial\rho}{\partial x}(t, x)\,\sigma + \frac{1}{2}\frac{\partial^2\rho}{\partial x^2}(t, x)\,\sigma^2 + \dots$$

inside the integral. Integrating the latter expansion against $\phi$ gives, by (6.7) and the
assumption that $\phi$ is even (which knocks out the linear term),

$$\int_{-\infty}^{\infty}\rho(t, x - \sigma)\,\phi(\sigma)\,d\sigma = \rho(t, x) + \frac{1}{2}\frac{\partial^2\rho}{\partial x^2}(t, x)\int_{-\infty}^{\infty}\sigma^2\phi(\sigma)\,d\sigma + \dots.$$

Substituting this formula together with (6.9) into (6.8) and keeping the leading terms
gives

$$\frac{\partial\rho}{\partial t}(t, x)\,\tau = \frac{1}{2}\frac{\partial^2\rho}{\partial x^2}(t, x)\int_{-\infty}^{\infty}\sigma^2\phi(\sigma)\,d\sigma.$$

Einstein then assumes that the value

$$D = \frac{1}{2\tau}\int_{-\infty}^{\infty}\sigma^2\phi(\sigma)\,d\sigma \tag{6.10}$$

is a fixed constant, so that the equation for $\rho$ becomes

$$\frac{\partial\rho}{\partial t} - D\frac{\partial^2\rho}{\partial x^2} = 0,$$

i.e., the heat equation. Remarkably, the function $\phi$ representing the random distribution
of displacements plays no role in the final equation, except in the value of
the constant $D$. This fact is related to a fundamental result in probability called the
central limit theorem.

Diffusion  models  involving  random  motions  of  particles  are  prevalent  in  physics, 
biology,  and  chemistry.  The  same  statistical  principles  appear  in  other  applications 
as  well,  for  example  in  models  of  the  spread  of  infection  in  medicine,  or  in  the  study 
of  fluctuating  financial  markets.  The  heat  equation  plays  a  fundamental  role  in  all  of 
these  applications. 
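The displacement rule (6.8) is easy to iterate on a lattice. The sketch below is a toy discretization, not Einstein's own computation: it assumes integer displacements with a made-up symmetric distribution `phi` and time step `tau`, starts from a point mass, and compares the variance of the density after $N$ steps with $2Dt$, $t = N\tau$, which is the variance of the point-source solution of $\partial_t\rho = D\,\partial_x^2\rho$.

```python
def step(density, phi):
    """One time step of (6.8) on the integer lattice: mass at x moves to x + s with weight phi[s]."""
    new = {}
    for x, mass in density.items():
        for s, p in phi.items():
            new[x + s] = new.get(x + s, 0.0) + mass * p
    return new

def variance(density):
    """Variance of a (not necessarily normalized) lattice density."""
    total = sum(density.values())
    mean = sum(x * m for x, m in density.items()) / total
    return sum((x - mean) ** 2 * m for x, m in density.items()) / total

# Hypothetical symmetric displacement distribution (sums to 1) and time step.
phi = {-2: 0.1, -1: 0.2, 0: 0.4, 1: 0.2, 2: 0.1}
tau = 0.5
D = sum(s * s * p for s, p in phi.items()) / (2.0 * tau)  # discrete analogue of (6.10)

density = {0: 1.0}  # all particles start at the origin
steps = 20
for _ in range(steps):
    density = step(density, phi)

print(variance(density), 2.0 * D * steps * tau)  # the two values agree up to rounding
```

Changing the shape of `phi` while keeping its second moment fixed leaves the variance unchanged, which mirrors the remark that only the constant $D$ survives in the limiting equation.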


6.2  Scale-Invariant  Solution 


Let us consider the heat equation on $\mathbb{R}$, with physical constants normalized to 1,

$$\frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial x^2} = 0. \tag{6.11}$$

Note that the equation is invariant under the rescaling $(t, x) \mapsto (\lambda^2 t, \lambda x)$, with $\lambda$
a nonzero constant. This suggests that a change of variables to the scale-invariant ratio
$y := x/\sqrt{t}$ might simplify the equation.

Let us try to find a solution of the form $u(t, x) = q(y)$ for $t > 0$. By the chain
rule,

$$\frac{\partial u}{\partial t} = -\frac{y}{2t}\,q', \qquad \frac{\partial^2 u}{\partial x^2} = \frac{1}{t}\,q''.$$

Thus, as an equation for $q$, (6.11) reduces to an ODE,

$$q'' + \frac{y}{2}\,q' = 0.$$

This can be solved for $q'$ by separation of variables for ODE, as described in Sect. 2.5,

$$q'(y) = q'(0)\,e^{-y^2/4}.$$


A further integration yields

$$q(y) = q'(0)\int_0^y e^{-y'^2/4}\,dy' + q(0).$$

In the original coordinates this translates to

$$u(t, x) = C_1\int_0^{x/\sqrt{t}} e^{-y^2/4}\,dy + C_2.$$

It is easy to confirm that this solves (6.11) for $t > 0$. To see what happens as $t \to 0$,
note that

$$\int_0^\infty e^{-y^2/4}\,dy = \sqrt{\pi},$$

by the computations from Exercise 2.5. Thus

$$\lim_{t\to 0} u(t, x) = \begin{cases} C_1\sqrt{\pi} + C_2, & x > 0, \\ C_2, & x = 0, \\ -C_1\sqrt{\pi} + C_2, & x < 0. \end{cases} \tag{6.12}$$

In view of this limiting behavior, let us consider the particular solution $U$ defined by
setting $C_1 = \frac{1}{2\sqrt{\pi}}$, $C_2 = \frac{1}{2}$,

$$U(t, x) := \frac{1}{2} + \frac{1}{2\sqrt{\pi}}\int_0^{x/\sqrt{t}} e^{-y^2/4}\,dy. \tag{6.13}$$

This solution is plotted for some small values of $t$ in Fig. 6.1. By (6.12),
$\lim_{t\to 0} U(t, x) = \Theta(x)$, the Heaviside step function defined by

$$\Theta(x) := \begin{cases} 1, & x > 0, \\ \tfrac{1}{2}, & x = 0, \\ 0, & x < 0. \end{cases}$$

The fact that $U(t, x)$ has such a simple limit as $t \to 0$ can be used to derive a more
general integral formula. Suppose we want to solve (6.11) for the initial condition

$$u(0, x) = \varphi(x)$$

with $\varphi \in C^\infty_{\mathrm{cpt}}(\mathbb{R})$. The key observation is that $\varphi$ can be reproduced by integrating its
derivative against the Heaviside function,

$$\int_{-\infty}^{\infty}\varphi'(z)\,\Theta(x - z)\,dz = \int_{-\infty}^{x}\varphi'(z)\,dz = \varphi(x).$$

Fig. 6.1 The heat solution $U(t, x)$ for times from $t = 0$ to 3

This suggests that we could solve the heat equation with initial condition $\varphi$ by setting

$$u(t, x) = \int_{-\infty}^{\infty}\varphi'(z)\,U(t, x - z)\,dz.$$

For $t > 0$ the function $U(t, x - z)$ is continuously differentiable, so we can integrate
by parts to rewrite this as

$$u(t, x) = -\int_{-\infty}^{\infty}\varphi(z)\,\frac{\partial}{\partial z}\bigl[U(t, x - z)\bigr]\,dz = \frac{1}{\sqrt{4\pi t}}\int_{-\infty}^{\infty}\varphi(z)\,e^{-(x - z)^2/4t}\,dz. \tag{6.14}$$

In terms of the function

$$H_t(x) := \frac{1}{\sqrt{4\pi t}}\,e^{-x^2/4t},$$

the solution is

$$u(t, x) = \int_{-\infty}^{\infty} H_t(x - z)\,\varphi(z)\,dz.$$

This integral is called the convolution of $H_t$ with $\varphi$ and denoted $u = H_t * \varphi$.

In the next section, we will generalize this convolution formula to higher dimension,
and check that it does yield a solution of the heat equation satisfying the desired
initial condition.


6.3  Integral  Solution  Formula 

Consider the heat equation on $\mathbb{R}^n$,

$$\frac{\partial u}{\partial t} - \Delta u = 0 \tag{6.15}$$

for $t > 0$, with initial condition

$$u(0, x) = g(x)$$

for $x \in \mathbb{R}^n$.

Inspired by the calculations in Sect. 6.2, we define

$$H_t(x) := (4\pi t)^{-n/2}\,e^{-|x|^2/4t} \tag{6.16}$$

for $t > 0$. The normalizing factor $(4\pi t)^{-n/2}$ is chosen so that

$$\int_{\mathbb{R}^n} H_t(x)\,d^n x = 1. \tag{6.17}$$

(See Exercise 2.5 for the computation of integrals of this type.)
Direct differentiation shows that for $t > 0$,

$$\frac{\partial}{\partial t}H_t - \Delta H_t = 0, \tag{6.18}$$

so that $H_t(x)$ is a solution of the heat equation. However, the limit of $H_t(x)$ as $t \to 0$
is 0 for $x \ne 0$ and $\infty$ for $x = 0$, which does not seem to make sense as a distribution
of temperatures. (We will return to discuss the interpretation of this initial condition
in Chap. 12.)

With (6.14) as motivation, our goal in this section is to show that the convolution
$u = H_t * g$ satisfies the heat equation on $\mathbb{R}^n$ for a continuous and bounded initial
condition $g$. A function that acts on other functions by convolution is an integral
kernel, and $H_t$ is specifically called the heat kernel on $\mathbb{R}^n$. It is also called the
fundamental solution of the heat equation, for reasons we will explain in Chap. 12.

Theorem 6.2 For a bounded function $g \in C^0(\mathbb{R}^n)$, the heat equation

$$\left(\frac{\partial}{\partial t} - \Delta\right)u = 0, \qquad u|_{t=0} = g, \tag{6.19}$$

admits a classical solution given by

$$u(t, x) = H_t * g(x). \tag{6.20}$$

Proof Explicitly, the formula (6.20) says that

$$u(t, x) = (4\pi t)^{-n/2}\int_{\mathbb{R}^n} e^{-|x - y|^2/4t}\,g(y)\,d^n y. \tag{6.21}$$

Because the domain is infinite here, we should treat differentiation under the integral
with some care. To justify this, the key point is that the partial derivatives of $H_t$ can be
estimated by expressions of the form $c_1(t, x)\,e^{-c_2(t, x)|y|^2}$, where the dependence of the
constants $c_1$ and $c_2$ on $t$ and $x$ is continuous. We will not go into the technical details,
but this makes it relatively straightforward to check that differentiating under the
integral works in this case. In particular, the fact that $H_t(x)$ solves the heat equation
implies that $u$ satisfies the equation in (6.19).

To check the initial condition, fix some $x \in \mathbb{R}^n$. A change of variables to
$w = (y - x)/\sqrt{t}$ in (6.21) gives

$$u(t, x) = (4\pi)^{-n/2}\int_{\mathbb{R}^n} e^{-|w|^2/4}\,g\bigl(x + t^{1/2}w\bigr)\,d^n w = \int_{\mathbb{R}^n} H_1(w)\,g\bigl(x + t^{1/2}w\bigr)\,d^n w.$$

By (6.17) we can also write

$$g(x) = \int_{\mathbb{R}^n} H_1(w)\,g(x)\,d^n w.$$

Thus the difference we are trying to estimate is

$$u(t, x) - g(x) = \int_{\mathbb{R}^n} H_1(w)\Bigl[g\bigl(x + t^{1/2}w\bigr) - g(x)\Bigr]\,d^n w. \tag{6.22}$$

Given $\varepsilon > 0$, we can use the exponential decay of $H_1(w)$ as $|w| \to \infty$ to choose
$R$ so that

$$\int_{|w| \ge R} H_1(w)\,d^n w < \varepsilon.$$

Since $g$ is bounded by assumption, there exists a constant $M$ so that $|g| \le M$. The
“large-$w$” piece of (6.22) can thus be estimated by

$$\left|\int_{|w| \ge R} H_1(w)\Bigl[g\bigl(x + t^{1/2}w\bigr) - g(x)\Bigr]\,d^n w\right| \le 2M\varepsilon. \tag{6.23}$$

On the other hand, by the continuity of $g$ we can choose $\delta > 0$ so that for all $y$
with $|x - y| < \delta$, we have

$$|g(y) - g(x)| < \varepsilon.$$

Thus for $|w| \le R$ and $t < \delta^2/R^2$,

$$\Bigl|g\bigl(x + t^{1/2}w\bigr) - g(x)\Bigr| < \varepsilon.$$

It follows that for $t < \delta^2/R^2$,

$$\left|\int_{|w| < R} H_1(w)\Bigl[g\bigl(x + t^{1/2}w\bigr) - g(x)\Bigr]\,d^n w\right| \le \varepsilon\int_{|w| < R} H_1(w)\,d^n w \le \varepsilon\int_{\mathbb{R}^n} H_1(w)\,d^n w = \varepsilon. \tag{6.24}$$

Combining the estimates (6.23) and (6.24) gives

$$|u(t, x) - g(x)| \le (2M + 1)\varepsilon$$

for $0 < t < \delta^2/R^2$. Since $\varepsilon$ was arbitrary, this shows that

$$\lim_{t \to 0} u(t, x) = g(x).$$

□
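Theorem 6.2 can be illustrated numerically in one dimension. The sketch below approximates $H_t * g$ by a midpoint rule for the sample datum $g(y) = e^{-y^2}$, which is an assumption made here because its convolution with the heat kernel has the closed form $(1 + 4t)^{-1/2} e^{-x^2/(1 + 4t)}$ (a convolution of two Gaussians). The integration window `half_width` and the step count are illustrative choices.

```python
import math

def heat_kernel(t, x):
    """One-dimensional heat kernel H_t(x) from (6.16) with n = 1."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def convolve_with_kernel(g, t, x, half_width=12.0, steps=4000):
    """Midpoint-rule approximation of (H_t * g)(x) over the window [x - half_width, x + half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        y = x - half_width + (i + 0.5) * h
        total += heat_kernel(t, x - y) * g(y)
    return total * h

def g(y):
    """Sample bounded continuous initial condition (an assumption of this sketch)."""
    return math.exp(-y * y)

def exact(t, x):
    """Closed form of H_t * g for the Gaussian datum above."""
    return math.exp(-x * x / (1.0 + 4.0 * t)) / math.sqrt(1.0 + 4.0 * t)

for t in (1.0, 0.1, 0.001):
    print(t, convolve_with_kernel(g, t, 0.7), exact(t, 0.7), g(0.7))
```

As $t$ decreases, the computed value at $x = 0.7$ approaches $g(0.7)$, which is the content of the limit proved above.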


Without extra restrictions on $u$, the solution of (6.19) is not necessarily unique.
However, if we start from a bounded initial condition $g$, then it is physically reasonable
to assume that $u$ is bounded over finite time intervals.

Theorem 6.3 Under the assumption that $u(t, \cdot)$ is bounded on $[0, T] \times \mathbb{R}^n$ for each
$T > 0$, the solution of the heat equation (6.19) is unique.

We will develop tools to prove this result (maximum principles) in Chap. 9. The
statement can be improved by weakening the boundedness hypothesis to an assumption
of exponential growth. The counterexamples to uniqueness exhibit superexponential
growth and are not considered valid as physical solutions.

In combination, Theorems 6.2 and 6.3 show that a bounded solution of the heat
equation on $\mathbb{R}^n$ with continuous initial data satisfies (6.21). The function $H_t(x)$ is $C^\infty$
in both variables for $t > 0$. As we noted in the proof of Theorem 6.2, differentiation
under the integral is justified in (6.21), so this regularity can be extended to general
solutions.

Theorem 6.4 Suppose that $u$ is a bounded solution of the heat equation (6.19) for
a bounded initial condition $g \in C^0(\mathbb{R}^n)$. Then $u \in C^\infty((0, \infty) \times \mathbb{R}^n)$.

Similar regularity results hold for the heat equation in other contexts, for example
on a bounded domain. We will discuss some of these cases later in Sect. 8.6. This
behavior, i.e., smoothness of solutions that does not depend on the regularity of the
initial data, is characteristic of parabolic equations.

Another interesting feature of the heat kernel is the fact that $H_t(x)$ is strictly
positive for all $t > 0$ and $x \in \mathbb{R}^n$. This means that if $g$ is nonnegative and not identically
zero, then $u$ is nonzero at all points $x \in \mathbb{R}^n$ for $t > 0$. Compare this to the
Huygens principle that we observed in Chap. 4, which says that for solutions of the
wave equation the range of influence of a point is limited by the (finite) propagation
speed. The heat equation exhibits infinite propagation speed.

We can see the origin of the infinite propagation speed in Einstein’s diffusion
model from Sect. 6.1. In (6.10) the value $D$, which is assumed to be constant, is the
average squared displacement per unit time. The fact that $D$ is fixed implies that in
the continuum limit $\tau \to 0$, the average absolute displacement per unit time diverges.
Hence the infinite propagation speed is built into the construction of the model. It
reflects the fact that models of diffusion are inherently statistical, and not expected
to be accurate on a microscopic scale.


6.4  Inhomogeneous  Problem 

Duhamel’s method, which was used to construct solutions of the inhomogeneous
wave equation in Sect. 4.4, was originally developed in the context of the heat equation.
There are slight differences in the setup, but the basic idea of translating a
forcing term into an initial condition applies in both settings.

Consider the equation on $\mathbb{R}^n$,

$$\frac{\partial u}{\partial t} - \Delta u = f \tag{6.25}$$

for $t > 0$, with initial condition $u(0, x) = 0$. For $s \ge 0$, let $\eta_s(t, x)$ be the solution
of the homogeneous heat equation (6.19) for $t \ge s$, subject to the initial condition

$$\eta_s(t, x)\big|_{t=s} = f(s, x). \tag{6.26}$$

We claim that the solution is then given by the integral

$$u(t, x) = \int_0^t \eta_s(t, x)\,ds.$$

Using the formula for $\eta_s$ provided by Theorem 6.2, the proposed solution can be
written

$$u(t, x) = \int_0^t\int_{\mathbb{R}^n} H_{t-s}(x - y)\,f(s, y)\,d^n y\,ds. \tag{6.27}$$

To justify this formula, we must investigate carefully what happens near the point
$t = s$.

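Theorem 6.5 Assuming that $f \in C^2([0, \infty) \times \mathbb{R}^n)$ and is compactly supported, the
formula (6.27) yields a classical solution to the inhomogeneous heat equation (6.25).

Proof We can see that $u$ is at least $C^2$ by changing variables in the integral formula
to obtain

$$u(t, x) = \int_0^t\int_{\mathbb{R}^n} H_s(y)\,f(t - s, x - y)\,d^n y\,ds.$$

Since $H_s(y)$ is smooth near $s = t$ and $f$ is compactly supported, differentiation under
the integral is justified. This gives

$$\frac{\partial u}{\partial t}(t, x) = \int_0^t\int_{\mathbb{R}^n} H_s(y)\,\frac{\partial f}{\partial t}(t - s, x - y)\,d^n y\,ds + \int_{\mathbb{R}^n} H_t(y)\,f(0, x - y)\,d^n y, \tag{6.28}$$

and

$$\Delta u(t, x) = \int_0^t\int_{\mathbb{R}^n} H_s(y)\,\Delta_x f(t - s, x - y)\,d^n y\,ds. \tag{6.29}$$

Our goal is to integrate by parts in these formulas, to exploit the fact that $H_s$ solves
the heat equation. Here we must be careful, because of the singular behavior of $H_s$
at $s = 0$.

To deal with this singularity, we split the integral at $s = \varepsilon$. For the first integral in
(6.28), switching the $t$ derivative to an $s$ derivative and integrating by parts gives

$$\begin{aligned}
\int_\varepsilon^t\int_{\mathbb{R}^n} H_s(y)\,\frac{\partial f}{\partial t}(t - s, x - y)\,d^n y\,ds
&= -\int_\varepsilon^t\int_{\mathbb{R}^n} H_s(y)\,\frac{\partial}{\partial s}f(t - s, x - y)\,d^n y\,ds \\
&= \int_\varepsilon^t\int_{\mathbb{R}^n} \frac{\partial H_s}{\partial s}(y)\,f(t - s, x - y)\,d^n y\,ds \\
&\quad - \int_{\mathbb{R}^n} H_t(y)\,f(0, x - y)\,d^n y + \int_{\mathbb{R}^n} H_\varepsilon(y)\,f(t - \varepsilon, x - y)\,d^n y.
\end{aligned}$$

The corresponding result for (6.29) has no boundary terms because of the compact
support of $f$,

$$\int_\varepsilon^t\int_{\mathbb{R}^n} H_s(y)\,\Delta_x f(t - s, x - y)\,d^n y\,ds = \int_\varepsilon^t\int_{\mathbb{R}^n} \Delta_y H_s(y)\,f(t - s, x - y)\,d^n y\,ds.$$

Applying these integrations by parts to (6.28) and (6.29), and using the fact that
$\left(\frac{\partial}{\partial s} - \Delta\right)H_s = 0$, we obtain

$$\left(\frac{\partial}{\partial t} - \Delta\right)u(t, x) = \int_{\mathbb{R}^n} H_\varepsilon(y)\,f(t - \varepsilon, x - y)\,d^n y + \int_0^\varepsilon\int_{\mathbb{R}^n} H_s(y)\left(\frac{\partial}{\partial t} - \Delta_x\right)f(t - s, x - y)\,d^n y\,ds. \tag{6.30}$$

Since $H_s > 0$, the second term can be estimated by

$$\left|\int_0^\varepsilon\int_{\mathbb{R}^n} H_s(y)\left(\frac{\partial}{\partial t} - \Delta_x\right)f(t - s, x - y)\,d^n y\,ds\right| \le C\int_0^\varepsilon\int_{\mathbb{R}^n} H_s(y)\,d^n y\,ds,$$

where $C$ is the maximum value of $\left|\left(\frac{\partial}{\partial t} - \Delta\right)f\right|$ (which exists by the assumption
of compact support). By (6.17), the integral of $H_s(y)$ over $y \in \mathbb{R}^n$ evaluates to 1,
yielding the estimate

$$\left|\int_0^\varepsilon\int_{\mathbb{R}^n} H_s(y)\left(\frac{\partial}{\partial t} - \Delta_x\right)f(t - s, x - y)\,d^n y\,ds\right| \le C\varepsilon.$$

We can therefore take $\varepsilon \to 0$ in (6.30) to obtain

$$\left(\frac{\partial}{\partial t} - \Delta\right)u(t, x) = \lim_{\varepsilon \to 0}\int_{\mathbb{R}^n} H_\varepsilon(y)\,f(t - \varepsilon, x - y)\,d^n y.$$

The remaining limit is very close to the limit computed in the proof of Theorem 6.2,
except that $t$ is replaced by $t - \varepsilon$. A simple modification of that argument shows that

$$\lim_{\varepsilon \to 0}\int_{\mathbb{R}^n} H_\varepsilon(y)\,f(t - \varepsilon, x - y)\,d^n y = f(t, x).$$

This completes the proof that $u$ satisfies (6.25).

□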
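Duhamel's formula can be tried out on a case where $\eta_s$ is known in closed form. The sketch below takes the time-independent forcing $f(s, y) = e^{-y^2}$ in one dimension; this $f$ is rapidly decaying rather than compactly supported, so it sits slightly outside the hypotheses of Theorem 6.5 and is used here only as an illustrative assumption. For this forcing, $\eta_s(t, x) = (1 + 4(t - s))^{-1/2} e^{-x^2/(1 + 4(t - s))}$, and the code checks by finite differences that $u = \int_0^t \eta_s\,ds$ satisfies $\partial_t u - \partial_x^2 u = f$.

```python
import math

def eta(s, t, x):
    """Solution of the homogeneous heat equation for t >= s with data exp(-x^2) at t = s."""
    spread = 1.0 + 4.0 * (t - s)
    return math.exp(-x * x / spread) / math.sqrt(spread)

def duhamel(t, x, steps=2000):
    """Midpoint-rule approximation of u(t, x) = integral over s in [0, t] of eta_s(t, x)."""
    h = t / steps
    return h * sum(eta((i + 0.5) * h, t, x) for i in range(steps))

def residual(t, x, h=1e-2):
    """Central-difference estimate of u_t - u_xx - f at (t, x), with f = exp(-x^2)."""
    u_t = (duhamel(t + h, x) - duhamel(t - h, x)) / (2.0 * h)
    u_xx = (duhamel(t, x + h) - 2.0 * duhamel(t, x) + duhamel(t, x - h)) / (h * h)
    return u_t - u_xx - math.exp(-x * x)

print(residual(0.5, 0.3))  # close to zero, limited by the finite-difference step
```

The residual is small but not zero, since both the quadrature in $s$ and the difference quotients introduce discretization error.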


6.5  Exercises 

6.1 Biological processes and chemical reactions are frequently described by
reaction-diffusion equations, consisting of the heat equation modified by a reaction
term. Consider the simplest such equation,

$$\frac{\partial u}{\partial t} + \gamma u - \Delta u = 0,$$

on $\mathbb{R}^n$ with initial condition $u(0, x) = f(x)$. Assuming $f$ is continuous and bounded,
find a formula for the solution. Hint: use a substitution of the form $u \to e^{-\alpha t}u$ to
reduce this to the ordinary heat equation.

6.2 For $t > 0$, $x > 0$, suppose that $u(t,x)$ satisfies the one-dimensional heat equation (6.11) with the initial condition $u(0,x) = 0$ for $x > 0$ and the boundary condition

$$u(t,0) = A\cos(\omega t)$$




for $t > 0$. Under the additional requirement that $u(t,\cdot)$ is bounded, find a solution $u(t,x)$. Hint: use separation of variables and assume that the temporal components have the form $e^{\pm i\omega t}$.

6.3 Let $\Omega \subset \mathbb{R}^n$ be a bounded domain with piecewise $C^1$ boundary. Suppose that $u(t,x)$ satisfies the heat equation

$$\frac{\partial u}{\partial t} - \Delta u = 0,$$

on $(0,\infty) \times \Omega$. Following the discussion from Sect. 6.1, we define the total thermal energy at time $t$ by

$$U[t] = \int_\Omega u(t,x)\, d^n x.$$

(a) Assume that $u$ satisfies Neumann boundary conditions,

$$\left.\frac{\partial u}{\partial \nu}\right|_{\partial\Omega} = 0$$

(the insulated case). Show that $U$ is constant.

(b) Assume that $u$ is positive in the interior of $\Omega$ and equals 0 on the boundary. Show that $U[t]$ is decreasing in this case.


6.4 Let $\Omega \subset \mathbb{R}^n$ be a bounded domain with piecewise $C^1$ boundary. Suppose that $u(t,x)$ is real-valued and satisfies the heat equation

$$\frac{\partial u}{\partial t} - \Delta u = 0$$

on $(0,\infty) \times \Omega$. Define

$$\eta(t) := \int_\Omega u(t,x)^2\, d^n x. \tag{6.31}$$

(a) Assume that $u$ satisfies the Dirichlet boundary conditions:

$$u(t,x)|_{x \in \partial\Omega} = 0$$

for $t > 0$. Show that $\eta$ decreases as a function of $t$.

(b) Use (a) to show that a solution $u$ satisfying boundary and initial conditions

$$u|_{t=0} = g, \qquad u|_{x \in \partial\Omega} = h,$$

for some continuous functions $g$ on $\Omega$ and $h$ on $\partial\Omega$, is uniquely determined by $g$ and $h$.


Chapter  7 

Function  Spaces 


In the preceding chapters we have seen that separation of variables can generate families of product solutions for certain PDE. For example, we found families of trigonometric solutions of the wave equation in Sect. 5.2 and the heat equation in Sect. 6.1. By the superposition principle, finite linear combinations of these functions give more general solutions.

It is natural to hope that we could push this construction farther and obtain solutions by infinite series. Solutions of PDE by trigonometric series were studied extensively in the 18th century by d'Alembert, Euler, Bernoulli, and others. However, notions of convergence were not well developed at that time, and many fundamental questions were left open.

In  this  chapter  we  will  introduce  some  basic  concepts  of  functional  analysis,  which 
will  give  us  the  tools  to  address  some  of  these  fundamental  issues. 


7.1  Inner  Products  and  Norms 

We assume that the reader has had a basic course in linear algebra and is familiar with the notion of a vector space, i.e., a set equipped with the operations of addition and scalar multiplication. The basic finite-dimensional example is the vector space $\mathbb{R}^n$. This space comes equipped with a natural inner product given by the dot product $v \cdot w$ for $v, w \in \mathbb{R}^n$. The Euclidean length of a vector $v \in \mathbb{R}^n$ is $\|v\| := \sqrt{v \cdot v}$.

In this section we will review the corresponding definitions for general real or complex vector spaces, which include function spaces. One important set of examples are the spaces $C^m(\Omega)$ introduced in Sect. 2.4, consisting of $m$-times continuously differentiable complex-valued functions on a domain $\Omega \subset \mathbb{R}^n$. Because differentiability and continuity of functions are preserved under linear combination and scalar multiplication, $C^m(\Omega)$ is naturally a complex vector space.

An inner product on a complex vector space $V$ is a function of two variables,

$$u, v \in V \mapsto \langle u, v\rangle \in \mathbb{C},$$

satisfying the following properties:

(I1) Positive definiteness: $\langle v, v\rangle \ge 0$ for $v \in V$, with equality only if $v = 0$.

(I2) Symmetry: for $v, w \in V$,

$$\langle v, w\rangle = \overline{\langle w, v\rangle}.$$

(I3) Linearity in the first variable: for $c_1, c_2 \in \mathbb{C}$ and $v_1, v_2, w \in V$,

$$\langle c_1 v_1 + c_2 v_2, w\rangle = c_1 \langle v_1, w\rangle + c_2 \langle v_2, w\rangle.$$

Together, (I2) and (I3) imply conjugate linearity in the second variable,

$$\langle w, c_1 v_1 + c_2 v_2\rangle = \overline{c_1} \langle w, v_1\rangle + \overline{c_2} \langle w, v_2\rangle.$$


The  combination  of  linearity  and  conjugate  linearity  in  the  respective  variables  is 
called  sesquilinearity.  In  the  real  case,  the  complex  conjugation  can  be  omitted, 
reducing  sesquilinearity  to  bilinearity. 

An inner product space is a real or complex vector space $V$ equipped with an inner product $\langle \cdot, \cdot\rangle$. The Euclidean inner product on $\mathbb{C}^n$ is defined by including a conjugation in the dot product,

$$\langle v, w\rangle := v \cdot \overline{w}. \tag{7.1}$$

One way to define an inner product on function spaces is by integration. For example, on $C^0[0,1]$ we could take

$$\langle f, g\rangle := \int_0^1 f\, \overline{g}\, dx.$$

Certain geometric notions are carried over from Euclidean geometry to inner product spaces. For example, vectors $u, v$ in an inner product space $V$ are called orthogonal if

$$\langle u, v\rangle = 0.$$
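The integral inner product on $C^0[0,1]$ can be explored numerically. The following is a minimal sketch, assuming a midpoint-rule approximation of the integral; the helper names `inner`, `one`, `centered`, and `wave` are hypothetical and chosen only for this illustration.

```python
import cmath


def inner(f, g, steps=2000):
    """Midpoint-rule approximation of <f, g> = integral over [0, 1] of f * conj(g)."""
    dx = 1.0 / steps
    total = 0j
    for i in range(steps):
        x = (i + 0.5) * dx
        total += f(x) * complex(g(x)).conjugate()
    return total * dx


def one(x):
    """The constant function 1."""
    return 1.0


def centered(x):
    """The function x - 1/2, which has mean zero on [0, 1]."""
    return x - 0.5


def wave(x):
    """The complex exponential exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


if __name__ == "__main__":
    print(abs(inner(one, centered)))   # close to 0: the two functions are orthogonal
    print(inner(wave, wave).real)      # close to 1: wave has unit length
```

Swapping the arguments of `inner` conjugates the result, which is the symmetry property (I2) in this setting.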

The analog of length for vectors in $V$ is called a norm. A norm is a function $\|\cdot\| : V \to \mathbb{R}$ satisfying the following properties: for all $u, v \in V$ and scalar $\lambda$,

(N1) Positive definiteness: $\|u\| \ge 0$ with equality only if $u = 0$.

(N2) Homogeneity: $\|\lambda u\| = |\lambda|\, \|u\|$.

(N3) Triangle inequality: $\|u + v\| \le \|u\| + \|v\|$.

For an inner product space, the definition of the Euclidean length in terms of the dot product suggests that the function

$$\|v\| := \sqrt{\langle v, v\rangle} \tag{7.2}$$

should yield a norm.

Positive definiteness of (7.2) clearly follows from positive definiteness of the inner product, and homogeneity follows from sesquilinearity. To see that (7.2) also satisfies the triangle inequality, we present a relation first derived in the Euclidean case by the great 19th century analyst Augustin-Louis Cauchy; Hermann Schwarz later generalized the result to inner product spaces.

Theorem 7.1 (Cauchy-Schwarz inequality) For an inner product space $V$ with $\|\cdot\|$ defined by (7.2),

$$|\langle v, w\rangle| \le \|v\|\, \|w\|$$

for all $v, w \in V$.

Proof For $v, w \in V$ and $t \in \mathbb{R}$, consider the function

$$q(t) := \|v + t \langle v, w\rangle w\|^2.$$

The claimed inequality is trivial if $w = 0$, so assume $w \ne 0$. By (I2), (I3), and (7.2),

$$\begin{aligned} q(t) &= \langle v + t \langle v, w\rangle w,\ v + t \langle v, w\rangle w\rangle \\ &= \|v\|^2 + 2t\, |\langle v, w\rangle|^2 + t^2\, |\langle v, w\rangle|^2\, \|w\|^2. \end{aligned}$$

The minimum of this quadratic polynomial occurs at $t_0 = -1/\|w\|^2$. Since $q \ge 0$,

$$0 \le q(t_0) = \|v\|^2 - \frac{|\langle v, w\rangle|^2}{\|w\|^2},$$

which gives the claimed inequality. □

The triangle inequality for (7.2) follows from the Cauchy-Schwarz inequality by

$$\begin{aligned} \|u + v\|^2 &= \langle u + v, u + v\rangle \\ &= \|u\|^2 + 2\operatorname{Re} \langle u, v\rangle + \|v\|^2 \\ &\le \|u\|^2 + 2\, |\langle u, v\rangle| + \|v\|^2 \\ &\le \|u\|^2 + 2\, \|u\|\, \|v\| + \|v\|^2 \\ &= (\|u\| + \|v\|)^2. \end{aligned}$$


Thus  (7.2)  defines  a  norm  associated  to  the  inner  product.  This  definition  of  the  norm 
is  used  by  default  on  an  inner  product  space. 
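As a quick sanity check, the Cauchy-Schwarz and triangle inequalities can be tested on random vectors in $\mathbb{C}^n$ with the Euclidean inner product (7.1). This is only a sketch: the function names, the sample size, and the rounding tolerance `tol` are assumptions made for the illustration.

```python
import math
import random


def inner(v, w):
    """Euclidean inner product on C^n, as in (7.1): sum of v_j * conj(w_j)."""
    return sum(a * b.conjugate() for a, b in zip(v, w))


def norm(v):
    """Norm associated to the inner product, as in (7.2)."""
    return math.sqrt(inner(v, v).real)


def random_vector(n, rng):
    """Vector in C^n with real and imaginary parts drawn uniformly from [-1, 1]."""
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]


def check_inequalities(trials=1000, n=5, seed=0):
    """Return True if Cauchy-Schwarz and the triangle inequality hold in every trial."""
    rng = random.Random(seed)
    tol = 1e-12  # allowance for floating-point rounding
    for _ in range(trials):
        v, w = random_vector(n, rng), random_vector(n, rng)
        s = [a + b for a, b in zip(v, w)]
        if abs(inner(v, w)) > norm(v) * norm(w) + tol:
            return False
        if norm(s) > norm(v) + norm(w) + tol:
            return False
    return True


if __name__ == "__main__":
    print(check_inequalities())
```

A random test of this kind can only fail to find a counterexample; the proof above is what guarantees the inequalities.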
It is possible to have a norm that is not associated to an inner product. For example, this is the case for the sup norm, defined for $f \in C^0(\Omega)$, with $\Omega \subset \mathbb{R}^n$ bounded, by

$$\|f\|_{\sup} := \sup \{ |f(x)| ;\ x \in \Omega \}. \tag{7.3}$$

We  will  explain  how  to  tell  that  a  norm  does  not  come  from  an  inner  product  in  the 
exercises. 
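One standard criterion, stated here as background rather than taken from the text, is the parallelogram law $\|u+v\|^2 + \|u-v\|^2 = 2\|u\|^2 + 2\|v\|^2$, which holds for every norm of the form (7.2). The sketch below evaluates the defect of this identity for the sup norm on $[0,1]$, approximated by sampling on a grid; the function names and the choice of test functions are assumptions of the example.

```python
def sup_norm(f, samples=1001):
    """Approximate the sup norm on [0, 1] by sampling a uniform grid including endpoints."""
    return max(abs(f(i / (samples - 1))) for i in range(samples))


def parallelogram_defect(f, g):
    """Return ||f+g||^2 + ||f-g||^2 - 2(||f||^2 + ||g||^2) for the sampled sup norm."""
    plus = sup_norm(lambda x: f(x) + g(x))
    minus = sup_norm(lambda x: f(x) - g(x))
    return plus ** 2 + minus ** 2 - 2.0 * (sup_norm(f) ** 2 + sup_norm(g) ** 2)


def ramp_up(x):
    return x


def ramp_down(x):
    return 1.0 - x


if __name__ == "__main__":
    # A nonzero defect means the parallelogram law fails for this pair.
    print(parallelogram_defect(ramp_up, ramp_down))
```

For these two functions all four sup norms equal 1, so the defect is $1 + 1 - 4 = -2$ rather than 0.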


7.2  Lebesgue  Integration 


In  the  early  20th  century,  Henri  Lebesgue  developed  an  extension  of  the  classic 
definition  of  the  integral  introduced  by  Bernhard  Riemann  in  1854.  (Riemann’s  is 
the version commonly taught in calculus courses.) Lebesgue's definition agrees with
the  Riemann  integral  when  the  latter  exists,  but  extends  to  a  broader  class  of  integrable 
functions. 

A  full  course  would  be  needed  to  develop  this  integration  theory  properly.  In  this 
section,  we  present  only  a  brief  sketch  of  the  Lebesgue  theory,  with  the  focus  on  the 
features  most  relevant  for  applications  to  PDE. 

The Lebesgue integral is based on a generalized notion of volume for subsets of $\mathbb{R}^n$, which can be defined in terms of approximation by rectangles. For a rectangular subset $\mathcal{R} \subset \mathbb{R}^n$, let $\operatorname{vol}(\mathcal{R})$ denote the usual notion of volume, the product of the lengths of the sides. (It is conventional to use "volume" as a general term when the dimension is arbitrary.) The volume of a subset $A \subset \mathbb{R}^n$ can be overestimated by covering the set with rectangles, as illustrated in Fig. 7.1. The ($n$-dimensional) measure of $A$ is defined by taking the infimum of these overestimates,

$$m_n(A) := \inf \left\{ \sum_{j=1}^\infty \operatorname{vol}(\mathcal{R}_j) ;\ A \subset \bigcup_{j=1}^\infty \mathcal{R}_j \right\}. \tag{7.4}$$

For a bounded region with $C^1$ boundary, the definition (7.4) reproduces the notion of volume used in multivariable calculus. Note that the concept of measure is dependent on the dimension. The measure of a line segment in $\mathbb{R}^1$ is the length, but a line segment has measure zero in $\mathbb{R}^n$ for $n \ge 2$.
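The last claim can be illustrated with an explicit covering. The sketch below covers the diagonal segment $\{(s,s);\ 0 \le s \le 1\}$ in $\mathbb{R}^2$ by $N$ closed squares of side $1/N$ and reports the total area, which is an overestimate in the sense of (7.4). The function names and the specific segment are assumptions chosen for the example.

```python
def covering_area(num_squares):
    """Total area of num_squares squares of side 1/num_squares placed along the diagonal."""
    side = 1.0 / num_squares
    return num_squares * side * side


def is_covered(s, num_squares):
    """Check that the point (s, s), with 0 <= s <= 1, lies in one of the covering squares.

    Square number j is [j/N, (j+1)/N] x [j/N, (j+1)/N]; the comparison is done in
    rescaled coordinates to avoid rounding problems at the edges.
    """
    t = s * num_squares
    j = min(int(t), num_squares - 1)
    return j <= t <= j + 1


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, covering_area(n))  # the overestimate shrinks like 1/n
```

Since the overestimates $1/N$ can be made arbitrarily small, the infimum in (7.4) is zero for this segment.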

There is a major technicality in the application of (7.4). In order to make the definition of measure consistent with respect to basic set operations, we cannot apply it to all possible subsets of $\mathbb{R}^n$. Instead, the definition is restricted to a special class of measurable sets. Lebesgue gave a criterion for measurability that rules out certain exotic sets for which volume is ill-defined. Fortunately, these sets are so exotic that we are unlikely to encounter them in normal usage. All open and closed sets in $\mathbb{R}^n$ are included in the measurable category, as are any sets constructed from them by basic set operations of union and intersection.

Fig. 7.1 Covering a set with rectangles


The characteristic function of a set $A \subset \mathbb{R}^n$ is defined by

$$\chi_A(x) := \begin{cases} 1, & x \in A, \\ 0, & \text{otherwise}. \end{cases} \tag{7.5}$$

The measure can be used to define the integral of a characteristic function,

$$\int_{\mathbb{R}^n} \chi_A\, d^n x := m_n(A),$$

provided $A$ is a measurable set. The integral of a general function is then built from approximations by linear combinations of characteristic functions. In order to construct these approximations, we need to use a restricted class of functions. A function $f : \Omega \to \mathbb{C}$ is called measurable if the preimage $f^{-1}(\mathcal{R})$ is a measurable subset of $\Omega$ for every rectangle $\mathcal{R} \subset \mathbb{C}$. Every Riemann-integrable function is measurable in the Lebesgue sense, so the measurable class includes all functions encountered in a traditional calculus class. Henceforth, whenever we write $f : \Omega \to \mathbb{C}$ or $\mathbb{R}$, we will assume implicitly that $f$ is measurable.

With this basic picture in mind, we will ask the reader to accept certain important consequences of the Lebesgue definition without further justification. In examples and exercises we will confine our attention to functions for which ordinary Riemann integrals exist.

It is standard practice when working with function spaces related to integration to make an equivalence:

$$f \equiv g \iff f = g \text{ except on a set of measure zero.} \tag{7.6}$$

For example, in $\mathbb{R}$ the characteristic functions of the intervals $(a, b)$ and $[a, b]$ are equivalent. In measure theory, a property is said to hold almost everywhere if it fails only on a set of measure zero. The equivalence (7.6) amounts to identifying functions that agree almost everywhere.

If functions $f$ and $g$ satisfy

$$\int_\Omega |f - g|\, d^n x = 0,$$

then there is no way to distinguish them in terms of integration. The definition (7.6) is motivated by the following:

Lemma 7.2 For measurable functions $f, g : \Omega \to \mathbb{C}$ with $\Omega \subset \mathbb{R}^n$,

$$\int_\Omega |f - g|\, d^n x = 0$$

if and only if $f \equiv g$.


7.3 $L^p$ Spaces


A function $f : \Omega \to \mathbb{C}$ is defined to be integrable if its integral converges absolutely, i.e.

$$\int_\Omega |f|\, d^n x < \infty.$$

For $p \ge 1$, we define the space of "$p$-integrable" functions by

$$L^p(\Omega) := \left\{ f : \Omega \to \mathbb{C} ;\ \int_\Omega |f|^p\, d^n x < \infty \right\}, \tag{7.7}$$

with the understanding that functions in $L^p$ are identified according to the equivalence (7.6). The space $L^p(\Omega)$ is clearly closed under scalar multiplication. Closure under addition is a consequence of the convexity of the function $x \mapsto |x|^p$ for $p \ge 1$, which implies the inequality

$$\left| \frac{f + g}{2} \right|^p \le \frac{|f|^p + |g|^p}{2}.$$

Hence $L^p(\Omega)$ is a complex vector space for $p \ge 1$.
The $L^p$ norm is defined by

$$\|f\|_p := \left( \int_\Omega |f|^p\, d^n x \right)^{1/p}.$$

To check that this is really a norm, we first note that Lemma 7.2 implies positive definiteness (N1) because of the equivalence relation (7.6). Homogeneity (N2) is satisfied because of cancellation between the powers $p$ and $1/p$.

Fig. 7.2 Step function

The triangle inequality (N3) is clear for $p = 1$ because $|f + g| \le |f| + |g|$. And for $p = 2$ it follows from the Cauchy-Schwarz inequality, because $\|\cdot\|_2$ is associated to the inner product

$$\langle f, g\rangle := \int_\Omega f\, \overline{g}\, d^n x. \tag{7.8}$$

In general the $L^p$ triangle inequality,

$$\|f + g\|_p \le \|f\|_p + \|g\|_p,$$

is called the Minkowski inequality and holds for $p \ge 1$. We omit the proof because we are mainly concerned with the cases $L^1$ and $L^2$.

Example 7.3 To illustrate the distinction between the $L^p$ norms, consider the function

$$h := a \chi_{[0,l]},$$

for $a, l > 0$, as shown in Fig. 7.2. For general $p \ge 1$,

$$\|h\|_p = a\, l^{1/p}. \tag{7.9}$$

If we think of $h$ as a density function, then the $L^1$ norm gives the total mass $\|h\|_1 = al$. The sensitivity of $\|\cdot\|_p$ to the spread of the function decreases as $p$ increases, as illustrated by the fact that

$$\lim_{p \to \infty} \|h\|_p = a.$$

For large $p$, the $L^p$ norms increasingly become measures of local concentration rather than mass. ◊
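The formula $\|h\|_p = a\,l^{1/p}$ is easy to confirm numerically. The sketch below approximates the $L^p$ norm on an interval by a midpoint rule; the names `lp_norm` and `step`, and the sample values of the height and width, are assumptions of this illustration.

```python
def lp_norm(f, p, a=0.0, b=1.0, steps=1000):
    """Midpoint-rule approximation of the L^p norm of f on the interval [a, b]."""
    dx = (b - a) / steps
    total = sum(abs(f(a + (i + 0.5) * dx)) ** p for i in range(steps))
    return (total * dx) ** (1.0 / p)


def step(height, width):
    """The step function h = height * (characteristic function of [0, width])."""
    return lambda x: height if 0.0 <= x <= width else 0.0


if __name__ == "__main__":
    height, width = 2.0, 0.25
    h = step(height, width)
    for p in (1, 2, 10, 100):
        # Compare the numerical norm with the closed form height * width**(1/p).
        print(p, lp_norm(h, p), height * width ** (1.0 / p))
```

As $p$ grows, the printed values approach the height of the step, in line with the limit stated in the example.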

Example 7.3 suggests the possibility of defining a space $L^\infty$ that is a limiting case of the $L^p$ spaces, with a norm that generalizes the sup norm (7.3). The sup norm itself does not respect the equivalence (7.6), so we must modify the definition to define a norm consistent with the other $L^p$ spaces.

For a function $h : \Omega \to \mathbb{R}$, the essential supremum is

$$\operatorname{ess\text{-}sup}(h) := \inf \left\{ a \in \mathbb{R} ;\ \{h > a\} \text{ has measure zero} \right\}. \tag{7.10}$$

Note that $\{h > a\}$ has measure zero precisely when $h$ is equivalent to a function bounded by $a$. The value $\operatorname{ess\text{-}sup}(h)$ is thus the least upper bound among all functions equivalent to $h$. For continuous functions the essential supremum reduces to the supremum.

For $f : \Omega \to \mathbb{C}$, we define

$$\|f\|_\infty := \operatorname{ess\text{-}sup} |f|. \tag{7.11}$$

The normed vector space $L^\infty(\Omega)$ consists of functions which are "essentially bounded",

$$L^\infty(\Omega) := \left\{ f : \Omega \to \mathbb{C} ;\ \|f\|_\infty < \infty \right\}, \tag{7.12}$$

subject to the equivalence (7.6).

Collectively, the $L^p$ spaces play a vital role in the analysis of PDE. The different norms can be thought of as a collection of measuring tools. Although the full toolkit is needed for many applications, for this book we will rely on the cases $p = 1$, $2$, or $\infty$.

Example 7.4 The Schrödinger equation in $\mathbb{R}^n$ describes the evolution of a quantum-mechanical wave function $\psi(t,x)$. In Exercise 4.7 we saw that solutions have constant spatial $L^2$ norms,

$$\|\psi(t,\cdot)\|_2 = \|\psi(0,\cdot)\|_2,$$

which corresponds to the conservation of total probability. On the other hand, solutions also satisfy a dispersive estimate

$$\|\psi(t,\cdot)\|_\infty \le C t^{-n/2}\, \|\psi(0,\cdot)\|_1,$$

for all $t > 0$, with $C$ a dimensional constant. The norm on the left measures the peak amplitude of the wave. By the estimate on the right, this amplitude is bounded in terms of the mass and decays as a function of time. In general, dispersive estimates describe the spreading of solutions as a function of time. ◊

It is conventional to represent elements of $L^p$ as ordinary functions, even though each element is actually an equivalence class of functions identified under (7.6). This usually causes no trouble because equivalent functions give the same results in integrals.

One point that requires clarification, however, is the issue of continuity or differentiability of functions in $L^p$. Under (7.6), a $C^m$ function is equivalent to a class of functions which are not even continuous. To account for this technicality, we adopt the convention that if a function in $L^p$ is equivalent to a continuous function, then the continuous representative is used by default. This is unambiguous because the continuous representative is unique when it exists. Under this convention, the statement that $f \in L^p$ is a $C^m$ function really means that $f$ admits a continuous representative which is $C^m$.


7.4  Convergence  and  Completeness 

In a normed vector space $V$, convergence of a sequence $v_n \to v$ means

$$\lim_{n \to \infty} \|v_n - v\| = 0. \tag{7.13}$$

We might also write this as

$$v = \lim_{n \to \infty} v_n,$$

provided the choice of norm is clear.

It frequently proves useful to approximate $L^p$ functions by smooth functions. For $p \ge 1$ there is a natural inclusion

$$C^\infty_{\mathrm{cpt}}(\Omega) \subset L^p(\Omega),$$

because continuous functions on a compact set are bounded. The Lebesgue theory gives the following:

Theorem 7.5 Assume $1 \le p < \infty$. For a function $f \in L^p(\Omega)$ there exists an approximating sequence $\{\psi_k\} \subset C^\infty_{\mathrm{cpt}}(\Omega)$, such that

$$\lim_{k \to \infty} \|\psi_k - f\|_p = 0.$$

A subset $W$ of a normed vector space $V$ is called dense if every $v \in V$ can be obtained as a limit of a sequence in $W$. Theorem 7.5 thus states that $C^\infty_{\mathrm{cpt}}(\Omega)$ is dense in $L^p(\Omega)$ for $p \in [1, \infty)$.

In  PDE  applications,  a  common  method  of  proving  the  existence  of  a  solution  is  to 
construct  a  sequence  of  approximate  solutions,  and  then  establish  convergence  of  this 
sequence  with  respect  to  an  appropriate  norm.  We  cannot  simply  use  the  definition 
(7.13)  to  check  convergence  in  this  situation,  because  the  limiting  function  may  not 
exist.  It  is  therefore  crucial  to  be  able  to  deduce  convergence  using  only  the  sequence 
itself. 
Fig. 7.3 A Cauchy sequence with respect to $\|\cdot\|_1$


The most useful tool for this purpose is a slightly weaker form of convergence. A sequence $\{v_k\} \subset V$ is said to be Cauchy if the difference between elements converges to zero: given $\varepsilon > 0$ there exists an $N$ such that $k, m \ge N$ implies

$$\|v_k - v_m\| < \varepsilon.$$

This Cauchy condition is sometimes written as a double limit,

$$\lim_{k, m \to \infty} \|v_k - v_m\| = 0.$$

Every convergent sequence is Cauchy. This is because the triangle inequality implies

$$\begin{aligned} \|v_k - v_m\| &= \|v_k - v + v - v_m\| \\ &\le \|v_k - v\| + \|v - v_m\|. \end{aligned}$$

If the sequence converges then the terms on the right are arbitrarily small for $k$ and $m$ sufficiently large.

In $\mathbb{R}^n$, it follows from the completeness axiom for real numbers that all Cauchy sequences are convergent. (See Theorem A.3.) This property does not necessarily hold in a general normed vector space, as the following demonstrates.

Example 7.6 Consider the space $C^0[-1, 1]$ equipped with the $L^1$ norm $\|\cdot\|_1$. For $n \in \mathbb{N}$ define the functions

$$f_n(x) = \begin{cases} -1, & x < -\frac{1}{n}, \\ nx, & -\frac{1}{n} \le x \le \frac{1}{n}, \\ 1, & x > \frac{1}{n}, \end{cases}$$

as illustrated in Fig. 7.3.

We can see that the sequence $\{f_n\}$ is Cauchy by computing

$$\|f_k - f_m\|_1 = \int_{-1}^{1} |f_k - f_m|\, dx = \left| \frac{1}{k} - \frac{1}{m} \right|.$$
However, for $f \in C^0[-1, 1]$,

$$\lim_{k \to \infty} \|f_k - f\|_1 = \int_{-1}^{0} |f + 1|\, dx + \int_0^1 |f - 1|\, dx.$$

This limit equals 0 only if $f(x) = -1$ for $x < 0$ and $f(x) = 1$ for $x > 0$. That is not possible for $f$ continuous. Therefore the sequence $\{f_n\}$ does not converge in $C^0[-1, 1]$. ◊
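The Cauchy property in Example 7.6 can be checked numerically. The following sketch approximates the $L^1$ distance between two of the ramp functions with a midpoint rule and compares it with $|1/k - 1/m|$; the function names and the step count are assumptions of the illustration.

```python
def f(n, x):
    """The function f_n from Example 7.6: a ramp from -1 to 1 on [-1/n, 1/n]."""
    if x < -1.0 / n:
        return -1.0
    if x > 1.0 / n:
        return 1.0
    return n * x


def l1_distance(k, m, steps=20000):
    """Midpoint-rule approximation of the L^1 norm of f_k - f_m on [-1, 1]."""
    dx = 2.0 / steps
    total = 0.0
    for i in range(steps):
        x = -1.0 + (i + 0.5) * dx
        total += abs(f(k, x) - f(m, x))
    return total * dx


if __name__ == "__main__":
    for k, m in ((2, 5), (10, 20), (100, 200)):
        # The numerical distance should be close to |1/k - 1/m|.
        print(k, m, l1_distance(k, m), abs(1.0 / k - 1.0 / m))
```

The distances tend to zero as $k$ and $m$ grow, even though the pointwise limit of the sequence is a discontinuous step.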

A normed vector space $V$ is complete if all Cauchy sequences in $V$ converge within $V$. Theorem A.3 implies that Euclidean $\mathbb{R}^n$ is complete in this sense. For $L^p$ spaces the Lebesgue integration theory gives the following result.

Theorem 7.7 For a domain $\Omega \subset \mathbb{R}^n$, the normed vector space $L^p(\Omega)$ is complete for each $p \in [1, \infty]$.

In functional analysis, a complete normed vector space is called a Banach space and a complete inner product space is called a Hilbert space. Thus Theorem 7.7 could be paraphrased as the statement that $L^p(\Omega)$ is a Banach space. The inner product space $L^2(\Omega)$ is a Hilbert space.

A subspace $W \subset V$ is closed if it contains the limit of every sequence in $W$ that converges in $V$.

Lemma 7.8 If $V$ is a complete normed vector space and $W \subset V$ is a closed subspace, then $W$ is complete with respect to the norm of $V$.

Proof Suppose $\{v_k\} \subset W$ is a Cauchy sequence. The sequence is also Cauchy in $V$, and so converges to some $v \in V$ by the completeness of $V$. Since $W$ is closed, $v \in W$. □

The $L^p$ function spaces have discrete counterparts, denoted by $\ell^p$, whose elements are sequences. To a sequence $(a_1, a_2, \dots)$ of complex numbers we associate the function $a : \mathbb{N} \to \mathbb{C}$ defined by $j \mapsto a_j$. The $\ell^p$ norm of this function is

$$\|a\|_{\ell^p} := \left( \sum_{j=1}^\infty |a_j|^p \right)^{1/p}.$$

The corresponding vector spaces are

$$\ell^p(\mathbb{N}) := \left\{ a : \mathbb{N} \to \mathbb{C} ;\ \|a\|_{\ell^p} < \infty \right\}, \tag{7.14}$$

for $p \ge 1$. It is possible to prove directly that $\ell^p(\mathbb{N})$ is complete, but this can also be deduced easily from Lemma 7.8. We interpret $\ell^p(\mathbb{N})$ as a closed subspace of $L^p(\mathbb{R})$ consisting of functions which are constant on each interval $[j, j+1)$ for $j \in \mathbb{N}$ and zero on $(-\infty, 1)$. On this subspace the $L^p$ norm reduces to the $\ell^p$ norm, so that Lemma 7.8 implies that $\ell^p(\mathbb{N})$ is complete. In particular, $\ell^2(\mathbb{N})$ is a Hilbert space with the inner product

$$\langle a, b\rangle_{\ell^2} := \sum_{j=1}^\infty a_j \overline{b_j}.$$
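To get a feeling for the $\ell^p$ spaces, one can compare partial norms of a single sequence for different $p$. The sketch below uses the sequence $a_j = 1/j$, chosen here only as an illustration, which belongs to $\ell^2(\mathbb{N})$ (the squares sum to $\pi^2/6$) but not to $\ell^1(\mathbb{N})$; the function names are hypothetical.

```python
import math


def lp_partial_norm(a, p, n):
    """l^p norm of the first n terms of the sequence j -> a(j), with j = 1, ..., n."""
    return sum(abs(a(j)) ** p for j in range(1, n + 1)) ** (1.0 / p)


def harmonic(j):
    """The sequence a_j = 1/j."""
    return 1.0 / j


if __name__ == "__main__":
    for n in (10, 1000, 100000):
        # The l^2 partial norms level off near pi/sqrt(6); the l^1 partial norms keep growing.
        print(n, lp_partial_norm(harmonic, 2, n), lp_partial_norm(harmonic, 1, n))
    print(math.pi / math.sqrt(6.0))
```

The contrast between the two columns shows that membership in $\ell^p(\mathbb{N})$ depends on $p$.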


7.5  Orthonormal  Bases 

Let $H$ be an infinite-dimensional complex Hilbert space. A sequence of vectors $\{e_1, e_2, \dots\} \subset H$ is orthonormal if

$$\langle e_j, e_k\rangle = \begin{cases} 1, & j = k, \\ 0, & j \ne k, \end{cases} \tag{7.15}$$

for all $j, k \in \mathbb{N}$. An orthonormal basis for $H$ is an orthonormal sequence such that each $v \in H$ admits a unique representation as a convergent series,

$$v = \sum_{j=1}^\infty c_j e_j, \tag{7.16}$$

with $c_j \in \mathbb{C}$.

As we will see in Sect. 7.6, the sets of eigenfunctions of certain differential operators naturally form orthonormal sequences with respect to the $L^2$ inner product. For example the sine eigenfunctions appearing in Theorem 5.2 have this property. If a sequence of eigenfunctions forms a basis, then we can expand general functions in terms of eigenfunctions.

Suppose we are given an orthonormal sequence $\{e_j\} \subset H$, and we would like to show that this forms a basis. To represent an element $v \in H$ in the form (7.16), we must decide how to choose the coefficients $c_j$. This works in much the same way as it does in finite dimensions. By the orthonormality property (7.15), we can compute that

$$\left\langle \sum_{j=1}^n c_j e_j,\ e_k \right\rangle = c_k \tag{7.17}$$

for all $n \ge k$. Assuming that $\sum_{j=1}^n c_j e_j$ converges to $v$ in $H$, we can take the limit $n \to \infty$ in (7.17) to compute

$$\langle v, e_k\rangle = c_k. \tag{7.18}$$

Based on this calculation, we assign coefficients to $v$ by setting

$$c_j[v] := \langle v, e_j\rangle. \tag{7.19}$$

The corresponding partial sums for $n \in \mathbb{N}$ are denoted by

$$S_n[v] := \sum_{j=1}^n c_j[v]\, e_j. \tag{7.20}$$

The condition that $\{e_j\}$ is a basis is equivalent to the convergence of $S_n[v] \to v$ in $H$ for every $v \in H$.


Theorem 7.9 (Bessel's inequality) Assume that $\{e_j\}$ is an orthonormal sequence in an infinite-dimensional Hilbert space $H$. For $v \in H$, the series $\sum |c_j[v]|^2$ converges and the limit satisfies

$$\sum_{j=1}^\infty |c_j[v]|^2 \le \|v\|^2.$$

Equality holds if and only if $S_n[v] \to v$ in $H$.

Proof Using the sesquilinearity (I3) of the inner product, we can expand

$$\begin{aligned} \|v - S_n[v]\|^2 &:= \langle v - S_n[v],\ v - S_n[v]\rangle \\ &= \langle v, v\rangle - \langle S_n[v], v\rangle - \langle v, S_n[v]\rangle + \langle S_n[v], S_n[v]\rangle, \end{aligned}$$

for $n \in \mathbb{N}$. By the definition (7.20) of $S_n[v]$ and the orthonormality condition (7.15),

$$\langle S_n[v], v\rangle = \langle v, S_n[v]\rangle = \langle S_n[v], S_n[v]\rangle = \sum_{j=1}^n |c_j[v]|^2.$$

We thus conclude that

$$\|v - S_n[v]\|^2 = \|v\|^2 - \sum_{j=1}^n |c_j[v]|^2. \tag{7.21}$$

Since the left-hand side is nonnegative, the identity (7.21) shows that

$$\sum_{j=1}^n |c_j[v]|^2 \le \|v\|^2$$

for all $n \in \mathbb{N}$. The partial sums of the series $\sum |c_j[v]|^2$ are thus bounded and the terms are all nonnegative. Hence the series converges by the monotone sequence theorem, to a limit satisfying the claimed bound,

$$\sum_{j=1}^\infty |c_j[v]|^2 \le \|v\|^2.$$

To complete the proof, note that $S_n[v] \to v$ in $H$ means that the limit as $n \to \infty$ of the left-hand side of (7.21) is zero. Hence $S_n[v] \to v$ if and only if

$$\lim_{n \to \infty} \sum_{j=1}^n |c_j[v]|^2 = \|v\|^2.$$

□
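Bessel's inequality can be observed in a concrete case. The sketch below assumes the real space $L^2(0,1)$, the orthonormal sequence $e_j(x) = \sqrt{2}\sin(j\pi x)$, and the sample function $v(x) = x$, with integrals approximated by a midpoint rule; all names and these particular choices are assumptions of the example rather than part of the text.

```python
import math


def inner(f, g, steps=2000):
    """Midpoint-rule approximation of the real L^2(0, 1) inner product of f and g."""
    dx = 1.0 / steps
    return sum(f((i + 0.5) * dx) * g((i + 0.5) * dx) for i in range(steps)) * dx


def e(j):
    """Orthonormal sine function e_j(x) = sqrt(2) * sin(j * pi * x) on (0, 1)."""
    return lambda x: math.sqrt(2.0) * math.sin(j * math.pi * x)


def bessel_sum(v, n):
    """Sum of |c_j[v]|^2 for j = 1, ..., n, with c_j[v] = <v, e_j> as in (7.19)."""
    return sum(inner(v, e(j)) ** 2 for j in range(1, n + 1))


def identity(x):
    return x


if __name__ == "__main__":
    norm_squared = inner(identity, identity)  # approximately 1/3
    for n in (1, 5, 50):
        # The partial sums increase with n and stay below the squared norm.
        print(n, bessel_sum(identity, n), norm_squared)
```

The partial sums increase toward $\|v\|^2 = 1/3$ without exceeding it, which is what (7.21) predicts.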


The  combination  of  completeness  and  Bessel’s  inequality  leads  to  an  alternative 
characterization  of  a  basis  that  is  easier  to  apply. 

Theorem 7.10 Suppose $H$ is an infinite-dimensional Hilbert space. An orthonormal sequence in $H$ forms a basis if and only if 0 is the only element of $H$ that is orthogonal to all vectors in the sequence.

Proof Assume first that $\{e_j\}$ forms a basis, so that every $v \in H$ can be written as a convergent sum $\sum c_j[v] e_j$. If $v$ is orthogonal to all of the vectors $e_j$, then $c_j[v] = 0$ for all $j$ by (7.19). Hence $v = 0$.

To establish the converse statement, let $\{e_j\}$ be an orthonormal sequence. For $v \in H$, Bessel's inequality implies

$$\sum_{j=1}^\infty |c_j[v]|^2 \le \|v\|^2 < \infty. \tag{7.22}$$

For $n < m$,

$$\|S_m[v] - S_n[v]\|^2 = \left\| \sum_{j=n+1}^m c_j[v]\, e_j \right\|^2 = \sum_{j=n+1}^m |c_j[v]|^2.$$

Hence (7.22) implies that

$$\lim_{m,n \to \infty} \|S_m[v] - S_n[v]\| = 0,$$

meaning that the sequence $\{S_n[v]\}$ is Cauchy in $H$. By completeness of $H$ this implies that $S_n[v] \to \tilde{v}$ for some $\tilde{v} \in H$.

Now assume that 0 is the only vector orthogonal to $e_j$ for all $j$. For $n \ge j$ we have

$$\begin{aligned} \langle v - S_n[v], e_j\rangle &= \langle v, e_j\rangle - \langle S_n[v], e_j\rangle \\ &= c_j[v] - c_j[v] \\ &= 0. \end{aligned}$$

Taking the limit as $n \to \infty$ with $j$ fixed gives

$$\langle v - \tilde{v}, e_j\rangle = 0.$$

Thus $v - \tilde{v}$ is orthogonal to every $e_j$, implying that $\tilde{v} = v$. This proves that $S_n[v] \to v$ in $H$ for each $v \in H$, and thus $\{e_j\}$ is a basis. □


7.6  Self-adjointness 

The  process  of  forming  a  basis  from  eigenvectors  of  an  operator  should  be  familiar 
from  linear  algebra;  for  a  finite-dimensional  matrix  this  is  called  diagonalization. 
Let us briefly recall the basic facts for the finite-dimensional case. A complex $n \times n$ matrix $A$ is self-adjoint (also called Hermitian) if the matrix is equal to its conjugate transpose. In terms of the Euclidean inner product (7.1) this means precisely that

$$\langle Au, v\rangle = \langle u, Av\rangle \tag{7.23}$$

for all $u, v \in \mathbb{C}^n$. (In the real case self-adjoint is the same as symmetric.)
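The condition (7.23) can be tested directly for small matrices. The sketch below uses plain Python lists and complex numbers; the helper names and the sample matrices `A` (Hermitian) and `B` (not Hermitian) are assumptions chosen for the illustration.

```python
def inner(v, w):
    """Euclidean inner product on C^n, as in (7.1)."""
    return sum(a * b.conjugate() for a, b in zip(v, w))


def apply(matrix, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(entry * x for entry, x in zip(row, v)) for row in matrix]


def conjugate_transpose(matrix):
    """Conjugate transpose: entry (j, i) of the result is the conjugate of entry (i, j)."""
    rows, cols = len(matrix), len(matrix[0])
    return [[matrix[i][j].conjugate() for i in range(rows)] for j in range(cols)]


A = [[2 + 0j, 1 - 1j], [1 + 1j, 3 + 0j]]  # equal to its conjugate transpose
B = [[1 + 0j, 2 + 0j], [0j, 1 + 0j]]      # not equal to its conjugate transpose
u = [1 + 2j, -1j]
v = [0.5 + 0j, 2 - 1j]

if __name__ == "__main__":
    print(inner(apply(A, u), v), inner(u, apply(A, v)))  # the two values agree
    print(inner(apply(B, u), v), inner(u, apply(B, v)))  # the two values differ
```

For the non-Hermitian matrix the two sides of (7.23) disagree, so the identity really does single out the self-adjoint case.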

The spectral theorem in linear algebra says that for a self-adjoint matrix $A$ there exists an orthonormal basis for $\mathbb{C}^n$ consisting of eigenvectors for $A$, with real eigenvalues. Functional analysis allows a powerful extension of this result, that applies in particular to certain differential operators acting on $L^2$ spaces. The full spectral theorem for Hilbert spaces is too technical for us to state here, but we will prove a version of this for the Laplacian on bounded domains later in Sect. 11.5.

Self-adjointness remains important as a hypothesis for the more general spectral theorem, but even this condition becomes rather technical in the Hilbert space setting. The issues arise from the fact that differential operators cannot act on the whole space $L^2(\Omega)$ because $L^2$ functions need not be differentiable. We will avoid these complexities, by focusing on the Laplacian and restricting our attention to $C^2$ functions.

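Lemma 7.11 Suppose that $\Omega \subset \mathbb{R}^n$ is a bounded domain with $C^1$ boundary. If $u, v \in C^2(\overline{\Omega})$ both satisfy either Dirichlet or Neumann boundary conditions on $\partial\Omega$, then

$$\langle \Delta u, v\rangle = \langle u, \Delta v\rangle. \tag{7.24}$$

Proof By Green's first identity (Theorem 2.10),

$$\int_\Omega \left( u\, \Delta \overline{v} - \overline{v}\, \Delta u \right) d^n x = \int_{\partial\Omega} \left( u \frac{\partial \overline{v}}{\partial \nu} - \overline{v} \frac{\partial u}{\partial \nu} \right) dS. \tag{7.25}$$

The Dirichlet conditions require that

$$u|_{\partial\Omega} = v|_{\partial\Omega} = 0,$$

implying the vanishing of the right-hand side of (7.25). Similarly, the Neumann conditions

$$\left. \frac{\partial u}{\partial \nu} \right|_{\partial\Omega} = \left. \frac{\partial v}{\partial \nu} \right|_{\partial\Omega} = 0 \tag{7.26}$$

also imply that the integrand on the right vanishes. Since the left-hand side of (7.25) equals $\langle u, \Delta v\rangle - \langle \Delta u, v\rangle$, this gives (7.24). □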

Boundary conditions for which (7.24) holds are called self-adjoint boundary conditions (for the Laplacian). Formally, (7.24) resembles the matrix condition (7.23), but of course there is no analog of boundary conditions in the matrix case. The proper definition of self-adjointness in functional analysis involves a more precise specification of the domain on which $\Delta$ acts and (7.24) holds. Even without going into these details, we can still draw some meaningful conclusions from Lemma 7.11.

Lemma 7.12 Suppose $\{\lambda_j\}$ is a sequence of eigenvalues of $-\Delta$ on a bounded domain $\Omega \subset \mathbb{R}^n$, with eigenvectors in $C^2(\overline{\Omega})$ subject to a self-adjoint boundary condition. Then $\lambda_j \in \mathbb{R}$ and, after possible rearrangement, the eigenvectors form an orthonormal sequence in $L^2(\Omega)$.

Furthermore, $\lambda_j > 0$ for Dirichlet conditions, and $\lambda_j \ge 0$ for Neumann.

Proof Suppose we have a sequence $\{\phi_j\} \subset C^2(\overline{\Omega})$ satisfying
$$-\Delta \phi_j = \lambda_j \phi_j.$$
The condition (7.24) implies that for $j, k \in \mathbb{Z}$,
$$\langle \Delta \phi_j, \phi_k \rangle = \langle \phi_j, \Delta \phi_k \rangle.$$
By the eigenvalue property this reduces to
$$(\lambda_j - \bar{\lambda}_k) \langle \phi_j, \phi_k \rangle = 0. \tag{7.27}$$
For $j = k$ the inner product equals $\|\phi_j\|^2 > 0$, implying that $\lambda_j \in \mathbb{R}$ for all $j$.
We can thus drop the conjugation in (7.27). If $\lambda_j \ne \lambda_k$, then this now implies that
$$\langle \phi_j, \phi_k \rangle = 0.$$
If some of the $\lambda_j$’s are equal, then every linear combination of the corresponding
eigenfunctions will still be an eigenfunction for the same value of $\lambda_j$. Hence we
can rearrange the eigenfunctions sharing a common eigenvalue into an orthogonal
set using the Gram-Schmidt procedure from linear algebra.

By multiplying the eigenfunctions by constants we can normalize so that
$\|\phi_j\| = 1$. The divergence theorem (Theorem 2.6) then implies
$$\lambda_j = \langle -\Delta \phi_j, \phi_j \rangle = \int_\Omega |\nabla \phi_j|^2 \, d^n x - \int_{\partial\Omega} \bar{\phi}_j \frac{\partial \phi_j}{\partial \nu} \, dS.$$
Either Dirichlet or Neumann conditions will cause the second term to vanish, implying
that $\lambda_j \ge 0$. If $\lambda_j = 0$ then the equation also shows that $\nabla \phi_j = 0$, implying that
$\phi_j$ is constant. In the Dirichlet case the only constant solution is trivial, $\phi_j = 0$, but
for Neumann conditions a nonzero constant is possible. □

Example 7.13 In Example 5.5 we found a set of eigenfunctions for a circular drumhead
modeled by the unit disk, with Dirichlet boundary conditions. The eigenfunctions
were given in polar coordinates by
$$\phi_{k,m}(r, \theta) := e^{ik\theta} J_k(j_{k,m} r), \qquad k \in \mathbb{Z},\ m \in \mathbb{N},$$
where $j_{k,m}$ is the $m$th positive zero of the Bessel function $J_k$. The eigenvalues of $-\Delta$
in this case are the values $j_{k,m}^2$. Since the only possible matches among the Bessel
zeros are $j_{k,m} = j_{-k,m}$, these are the only potential non-orthogonal pairs.

Let us examine the orthogonality condition more explicitly. In polar coordinates,
the $L^2$ inner product of two eigenfunctions is given by
$$\langle \phi_{k,m}, \phi_{k',m'} \rangle = \int_0^1 \int_0^{2\pi} \phi_{k,m}(r, \theta) \overline{\phi_{k',m'}(r, \theta)} \, r \, d\theta \, dr
= \int_0^1 \int_0^{2\pi} e^{i(k-k')\theta} J_k(j_{k,m} r) J_{k'}(j_{k',m'} r) \, r \, d\theta \, dr.$$
Note that the eigenfunctions are clearly orthogonal when $k \ne k'$, because the $\theta$
integral vanishes in this case. If we set $k = k'$, then the $\theta$ integral is trivial and the
inner product becomes
$$\langle \phi_{k,m}, \phi_{k,m'} \rangle = 2\pi \int_0^1 J_k(j_{k,m} r) J_k(j_{k,m'} r) \, r \, dr.$$
By Lemma 7.12 this integral vanishes for $m \ne m'$. The cancellations occur because
of the oscillations, just as for sine functions, as Fig. 7.4 illustrates.

Fig. 7.4 Radial components $J_1(j_{1,m} r)$ of orthogonal eigenfunctions on the disk


7.7  Exercises 


7.1 A norm $\|\cdot\|$ on a vector space $V$ satisfies the parallelogram law if
$$\|v + w\|^2 + \|v - w\|^2 = 2\|v\|^2 + 2\|w\|^2,$$
for all $v, w \in V$.

(a) Show that a norm defined by an inner product as in (7.2) satisfies the parallelogram
law.

(b) In $L^p(\mathbb{R})$, define the functions
$$f(x) = \chi_{[0,2]}, \qquad g(x) = \chi_{[0,1]} - \chi_{[1,2]}.$$
Use these to show that the parallelogram law fails for $\|\cdot\|_p$ if $p \ne 2$.

(c) Find an example to show that the parallelogram law fails for the sup norm (7.3).


7.2 Consider the sequence of functions on $\mathbb{R}$ defined by
$$f_n(x) = \begin{cases} n e^{-n^2 x}, & x \ge 0, \\ 0, & x < 0. \end{cases}$$
Show that $f_n \to 0$ in $L^1(\mathbb{R})$ but not in $L^2(\mathbb{R})$.

7.3 Consider the sequence of functions on $\mathbb{R}$ defined by
$$g_n(x) = n^{-1} \chi_{[0,n]}.$$
Show that $g_n \to 0$ in $L^2(\mathbb{R})$ but not in $L^1(\mathbb{R})$.

7.4 Consider the sequence $f_n(x) = x^n$ for $x \in (0, 1)$. Show that $f_n \to 0$ in $L^p(0, 1)$
for each $p \in [1, \infty)$, but not for $p = \infty$.

7.5 Assume that $\Omega \subset \mathbb{R}^n$ is a bounded domain. Show that there is a constant $C > 0$
such that for $f \in L^2(\Omega)$,
$$\|f\|_1 \le C \|f\|_2.$$
This implies in particular that $L^2(\Omega) \subset L^1(\Omega)$. Find an example to show that this
result does not hold for $\Omega$ unbounded.


7.6 As an application of the Cauchy-Schwarz inequality, we can use the quantity
$\eta$ defined in Exercise 6.4 to show that solutions of the heat equation with fixed
boundary values are uniquely determined by the values at time $t = T > 0$. Under
the hypotheses from that exercise, suppose that $u$ solves the heat equation with
$$u|_{t=T} = 0, \qquad u|_{x \in \partial\Omega} = 0.$$
The goal is to show that these assumptions imply $u = 0$ for all $t$.

(a) Use the Cauchy-Schwarz inequality to deduce that
$$\eta'(t)^2 \le 4 \eta(t) \int_\Omega \left| \frac{\partial u}{\partial t} \right|^2 d^n x,$$
where $\eta$ is defined as in (6.31).

(b) Show that
$$\eta''(t) = 4 \int_\Omega \left| \frac{\partial u}{\partial t} \right|^2 d^n x,$$
so that the inequality from (a) becomes
$$\eta'(t)^2 \le \eta(t) \eta''(t). \tag{7.28}$$

(c) Suppose that $\eta(0) > 0$. Then by continuity $\log \eta(t)$ is defined at least in some
neighborhood of $t = 0$. Using (7.28), show that
$$(\log \eta(t))'' \ge 0.$$
This implies that $\log \eta(t)$ is bounded below by its tangent lines. In particular
$$\log \eta(t) \ge \log \eta(0) + \frac{\eta'(0)}{\eta(0)} t,$$
which implies
$$\eta(t) \ge \eta(0) e^{-ct},$$
for $c = -\eta'(0)/\eta(0) \ge 0$. Thus if $\eta(0) > 0$ then $\eta$ is strictly positive for all
$t \ge 0$.

(d) Conclude from (c) that if $\eta(T) = 0$, then $\eta(t) = 0$ for all $t$, and deduce that
$u = 0$.


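7.7 Recall the radial decomposition formula (2.10). We can use this to get a basic
picture of the degree of singularity or decay at infinity that is allowed in each $L^p$.

(a) For $\gamma \in \mathbb{R}$ consider the function
$$g(x) = \begin{cases} r^\gamma, & r \le 1, \\ 0, & r > 1. \end{cases}$$
For what values of $\gamma$ and $p \in [1, \infty]$ is $g \in L^p(\mathbb{R}^n)$?

(b) For $\gamma \in \mathbb{R}$ consider the function
$$h(x) = \begin{cases} 0, & r \le 1, \\ r^\gamma, & r > 1. \end{cases}$$
For what values of $\gamma$ and $p \in [1, \infty]$ is $h \in L^p(\mathbb{R}^n)$?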


7.8 Consider the eigenfunctions given by (5.5) with $\ell = \pi$.

(a) Show that
$$\phi_n(x) := \sqrt{\frac{2}{\pi}} \sin(nx), \qquad n \in \mathbb{N},$$
defines an orthonormal sequence in $L^2(0, \pi)$. (Hint: recall the trigonometric
identity $\sin(\alpha) \sin(\beta) = \frac{1}{2} [\cos(\alpha - \beta) - \cos(\alpha + \beta)]$.)

(b) For the function $u = 1$, compute the corresponding expansion coefficients,
$$c_k[1] := \langle 1, \phi_k \rangle.$$
Under Theorem 7.9, what explicit summation condition corresponds to the convergence
$S_n[1] \to 1$ in $L^2(0, \pi)$?


Chapter  8 

Fourier  Series 


In his study of heat flow in 1807, Fourier made the radical claim that it should be
possible to represent all solutions of the one-dimensional heat equation by trigonometric
series. As we noted in the introduction to Chap. 7, trigonometric series had
been studied earlier by other mathematicians. Fourier’s innovation was to suggest
that the general solution could be obtained this way.

This  claim  proved  difficult  to  resolve,  because  the  tools  of  functional  analysis  that 
we  discussed  in  Chap.  7  were  not  yet  available  in  Fourier’s  time.  Indeed,  the  difficult 
problem  of  Fourier  series  convergence  provided  some  of  the  strongest  motivation  for 
the  development  of  these  tools. 

In  this  chapter  we  will  analyze  Fourier  series  in  more  detail,  and  show  that  the 
Fourier  approach  yields  a  general  solution  for  the  one-dimensional  heat  equation.  The 
primary  significance  of  this  approach  to  PDE  is  the  philosophy  of  spectral  analysis 
that  it  inspired.  The  decomposition  of  functions  with  respect  to  the  spectrum  of 
a  differential  operator  is  a  tool  with  enormous  applications,  both  theoretical  and 
practical. 


8.1  Series  Solution  of  the  Heat  Equation 

Consider the heat equation
$$\frac{\partial u}{\partial t} - \Delta u = 0, \tag{8.1}$$
on a domain $\Omega \subset \mathbb{R}^n$, with Dirichlet or Neumann boundary conditions. According
to Lemma 5.1 the product solutions of (8.1) have the form
$$u(t, x) = v(t) \phi(x),$$
where $\phi$ solves the Helmholtz equation
$$-\Delta \phi = \lambda \phi$$
on $\Omega$. The temporal equation is a simple ODE,
$$\frac{\partial v}{\partial t} = -\lambda v,$$
with the family of solutions
$$v(t) = v(0) e^{-\lambda t}. \tag{8.2}$$

Let us assume that the equation for $\phi$ admits a sequence of solutions $\phi_k$, with
eigenvalues $\lambda_k$. We have seen specific examples of this in Chap. 5, including the
trigonometric case in Theorem 5.2. By (8.2), the corresponding product solutions of
the heat equation are
$$u_k(t, x) := e^{-\lambda_k t} \phi_k(x).$$
Fourier’s strategy calls for us to express the general solution as a series,
$$u(t, x) = \sum_{n=1}^\infty a_n e^{-\lambda_n t} \phi_n(x), \tag{8.3}$$
for some choice of coefficients $a_n$. To fix the coefficients $a_n$ in (8.3) we assume an
initial condition $u(0, x) = h(x)$. Setting $t = 0$ gives
$$h(x) = \sum_{n=1}^\infty a_n \phi_n(x). \tag{8.4}$$
If we can show that $\{\phi_n\}$ forms an orthonormal basis of $L^2(\Omega)$, then this gives us
a way to assign coefficients to $h$ such that (8.4) holds, at least in the sense of $L^2$
convergence.

Even if the orthonormal basis property is established, some big issues still remain.
The fact that each term $u_n$ satisfies the heat equation does not guarantee that $u$ does,
because of the infinite series summation. Similarly, the limit of (8.3) as $t \to 0$
is not necessarily (8.4), because the limit cannot necessarily be taken inside the
summation. In this chapter we will explain how to resolve these problems in the
context of trigonometric series.


Example 8.1 Consider the case of a one-dimensional metal rod with insulated ends.
For convenience take the length to be $\pi$, so that $x \in [0, \pi]$ and the Neumann boundary
conditions are
$$\frac{\partial u}{\partial x}(t, 0) = \frac{\partial u}{\partial x}(t, \pi) = 0.$$
Let us consider the initial condition
$$h(x) = 3\pi x^2 - 2x^3, \tag{8.5}$$
as pictured on the left in Fig. 8.1.

The boundary conditions are satisfied by cosines,
$$\phi_n(x) = \cos(nx), \qquad n \in \mathbb{N}_0.$$
Hence the strategy outlined above calls for us to represent the initial condition as a
series
$$h(x) = \sum_{n=0}^\infty a_n \cos(nx). \tag{8.6}$$

To choose the coefficients, we recall the discussion of basis expansion from
Sect. 7.5. The cosines satisfy an orthogonality condition with respect to the $L^2$ inner
product on $[0, \pi]$,
$$\int_0^\pi \cos(mx) \cos(nx) \, dx = \begin{cases} 0, & m \ne n, \\ \pi, & m = n = 0, \\ \pi/2, & m = n \ge 1. \end{cases} \tag{8.7}$$
This could be checked with trigonometric identities, but it is perhaps easier to use
the complex form $\cos(kx) = \frac{1}{2}(e^{ikx} + e^{-ikx})$.

Since the sequence $\{\phi_n\}$ is not normalized, the coefficient formula (7.18) must be
interpreted as
$$a_n = \frac{\langle h, \phi_n \rangle}{\|\phi_n\|^2}.$$
By (8.7) the Fourier coefficients are thus given by
$$a_0 = \frac{1}{\pi} \int_0^\pi h(x) \, dx, \qquad a_n = \frac{2}{\pi} \int_0^\pi h(x) \cos(nx) \, dx, \quad n \ge 1. \tag{8.8}$$
After substituting (8.5) into (8.8), integration by parts yields
$$a_n = \begin{cases} \dfrac{\pi^3}{2}, & n = 0, \\[4pt] -\dfrac{48}{\pi n^4}, & n \ge 1, \text{ odd}, \\[4pt] 0, & n \ge 2, \text{ even}. \end{cases} \tag{8.9}$$

Fig. 8.1 Comparison of the initial condition $h(x)$ with the first two terms of its cosine series

Figure 8.1 shows a comparison between $h$ and the partial sum $S_1[h]$. The close match
between these functions is clearly evident. And since the higher coefficients decay
like $n^{-4}$, convergence of this series seems quite plausible. The resulting
solution would be given by
$$u(t, x) = \frac{\pi^3}{2} - \sum_{n \in \mathbb{N}_{\mathrm{odd}}} \frac{48}{\pi n^4} e^{-n^2 t} \cos(nx).$$
Note that the convergence rate improves dramatically as $t$ increases. ◊
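The coefficient computation in Example 8.1 can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the closed form $a_0 = \pi^3/2$, $a_n = -48/(\pi n^4)$ for odd $n$, $a_n = 0$ for even $n \ge 2$, and compares it with a Simpson-rule evaluation of the integrals (8.8). The helper names `simpson`, `a_numeric`, `a_exact` and `u` are illustrative choices, and the truncation at 25 terms is arbitrary.

```python
import math

def h(x):
    """Initial condition (8.5)."""
    return 3 * math.pi * x**2 - 2 * x**3

def simpson(g, a, b, m=2000):
    """Composite Simpson rule with m (even) subintervals."""
    step = (b - a) / m
    total = g(a) + g(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * g(a + i * step)
    return total * step / 3

def a_numeric(n):
    """Cosine coefficient a_n computed from the integrals (8.8)."""
    if n == 0:
        return simpson(h, 0.0, math.pi) / math.pi
    return 2 / math.pi * simpson(lambda x: h(x) * math.cos(n * x), 0.0, math.pi)

def a_exact(n):
    """Closed form assumed for (8.9)."""
    if n == 0:
        return math.pi**3 / 2
    return -48 / (math.pi * n**4) if n % 2 else 0.0

def u(t, x, terms=25):
    """Truncated series solution, using the eigenvalue n^2 for cos(nx)."""
    return sum(a_exact(n) * math.exp(-n**2 * t) * math.cos(n * x)
               for n in range(terms + 1))

for n in range(6):
    print(n, a_numeric(n), a_exact(n))

# At t = 0 the truncated series should reproduce h closely on [0, pi].
grid = [i * math.pi / 50 for i in range(51)]
print(max(abs(u(0.0, x) - h(x)) for x in grid))
```

For $t > 0$ the factors $e^{-n^2 t}$ suppress the higher modes, so far fewer terms are needed than at $t = 0$.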


8.2  Periodic  Fourier  Series 

We saw examples of Fourier series based on sines in Theorem 5.2 and cosines in
Example 8.1. To account for both cases, it is convenient to consider periodic functions
on $\mathbb{R}$. We define
$$\mathbb{T} := \mathbb{R}/(2\pi\mathbb{Z}), \tag{8.10}$$
where the quotient notation means that points separated by an integer multiple of
$2\pi$ are considered equivalent. The space $C^m(\mathbb{T})$ consists of the functions in $C^m(\mathbb{R})$
which are $2\pi$-periodic.

Integrals of functions on $\mathbb{T}$ are defined by restricting the range of integration to
an arbitrary interval of length $2\pi$ in $\mathbb{R}$. We will write the inner product on $L^2(\mathbb{T})$ as
$$\langle f, g \rangle = \int_{-\pi}^{\pi} f \bar{g} \, dx,$$
but the range of integration could be shifted if needed.

The Helmholtz equation on $\mathbb{T}$ is
$$-\frac{\partial^2 \phi}{\partial x^2} = \lambda \phi,$$
for $\phi \in C^2(\mathbb{T})$, with no need for additional boundary conditions because of the
periodicity. The eigenfunctions are the complex exponentials
$$\phi_k(x) := e^{ikx}, \tag{8.11}$$
for $k \in \mathbb{Z}$, with $\lambda_k = k^2$.

It  is  possible  to  recover  cosine  and  sine  Fourier  series  from  the  periodic  case, 
by  restricting  our  attention  to  even  or  odd  functions  on  T.  We  will  demonstrate  this 
specialization  in  the  examples  and  exercises. 

The complex exponentials satisfy a simple orthogonality relation,
$$\langle \phi_k, \phi_l \rangle = \int_{-\pi}^{\pi} e^{i(k-l)x} \, dx = \begin{cases} 2\pi, & k = l, \\ 0, & k \ne l. \end{cases}$$
We did not include a normalizing factor in (8.11), so $\|\phi_k\|^2 = 2\pi$ and the Fourier
coefficients of an integrable function $f \in L^1(\mathbb{T})$ are defined by
$$c_k[f] := \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-ikx} \, dx. \tag{8.12}$$

Because the index set is $\mathbb{Z}$ rather than $\mathbb{N}$, we define the partial sums of the periodic
Fourier series by truncating on both sides,
$$S_n[f](x) := \sum_{k=-n}^{n} c_k[f] e^{ikx}. \tag{8.13}$$

For the sequence $\{\phi_k\}$, Bessel’s inequality (Theorem 7.9) takes the form
$$\sum_{k \in \mathbb{Z}} |c_k[f]|^2 \le \frac{1}{2\pi} \|f\|^2, \tag{8.14}$$
with equality if and only if $S_n[f] \to f$ in $L^2(\mathbb{T})$.

In the specific example considered in Example 8.1, the Fourier series appeared
to converge very quickly. To illustrate potential complications with the convergence,
let us consider a function with a jump discontinuity.

Example 8.2 On the interval $[0, \pi]$, define the function
$$h(x) = \begin{cases} 0, & x \in [0, \frac{\pi}{2}], \\ 1, & x \in (\frac{\pi}{2}, \pi], \end{cases}$$
as pictured on the left in Fig. 8.2. As noted above, in order to represent $h$ as a cosine
series using the periodic eigenfunctions, we first extend $h$ to $\mathbb{T}$ as an even function,
i.e.,
$$h(x) = \begin{cases} 0, & x \in [-\frac{\pi}{2}, \frac{\pi}{2}] + 2\pi\mathbb{Z}, \\ 1, & x \in (\frac{\pi}{2}, \frac{3\pi}{2}) + 2\pi\mathbb{Z}. \end{cases}$$
By (8.12), with a shift to the more convenient interval $[0, 2\pi]$, the Fourier coefficients
of $h$ are
$$c_k[h] = \frac{1}{2\pi} \int_{\pi/2}^{3\pi/2} e^{-ikx} \, dx = \begin{cases} \dfrac{1}{2}, & k = 0, \\[4pt] \dfrac{(-1)^k}{\pi k} \sin\left(\dfrac{\pi k}{2}\right), & k \ne 0. \end{cases}$$
Since $c_{-k}[h] = c_k[h]$, we can combine terms in the partial sums (8.13) to give
$$S_n[h](x) = \frac{1}{2} + 2 \sum_{k=1}^{n} \frac{(-1)^k}{\pi k} \sin\left(\frac{\pi k}{2}\right) \cos(kx).$$
Figure 8.2 shows a sample of these partial sums. In contrast to the case of Example 8.1,
where 2 terms of the Fourier series were enough to give a very convincing
approximation, we can see significant issues with convergence in the vicinity of the
jump, even with 40 terms. ◊

Fig. 8.2 Fourier series expansion for a step function

The Fourier series computed in Example 8.2 makes for a good illustration of some
different notions of convergence. Consider the sequence of differences $h - S_n[h]$, as
illustrated in Fig. 8.3. If $\{\phi_k\}$ forms an $L^2(\mathbb{T})$ basis, then we would have
$$\lim_{n \to \infty} \|h - S_n[h]\|_2 = 0.$$
It is not easy to judge such a limit visually, but this claim is true, as we will prove in
Sect. 8.5.

Fig. 8.3 Plots of the differences $h - S_n[h]$ (recovered panel label: $n = 100$)

We could instead focus our attention on the values of $S_n[h](x)$ for some fixed $x$. A
sequence of functions $f_n$ is said to converge pointwise to $f$ (assuming these functions
have a common domain) if for each fixed $x$ in the domain,
$$\lim_{n \to \infty} f_n(x) = f(x).$$
In Fig. 8.3, if we focus our attention on some point $x$ away from the center, then the
bumps at this point do seem to be decreasing in size as $n$ gets larger. We will verify
in Sect. 8.3 that this Fourier series converges pointwise except at $x = \frac{\pi}{2}$.

Another feature that is quite apparent in Fig. 8.3 is the spike near the center. It is
possible to prove that such a spike persists, with height essentially constant, for all
values of $n$. The historical term for this effect, which is caused by the jump discontinuity,
is the Gibbs phenomenon. It was actually first observed in 1848 by Henry
Wilbraham, but remained generally unknown until it was rediscovered independently
by J. Willard Gibbs in 1899.
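The persistence of the spike can be seen in a small numerical experiment. The sketch below is an illustration under stated assumptions, not the text's own computation: it takes the step function of Example 8.2 to be $0$ on $[0, \pi/2]$ and $1$ on $(\pi/2, \pi]$ (the reading consistent with the factor $(-1)^k$ in its coefficients) and evaluates the partial sums $S_n[h](x) = \frac{1}{2} + 2\sum_{k=1}^n \frac{(-1)^k}{\pi k}\sin(\frac{\pi k}{2})\cos(kx)$. The function names, grid width and sample count are arbitrary choices.

```python
import math

def S(n, x):
    """Partial sum S_n[h](x) of the cosine series for the step function."""
    return 0.5 + 2 * sum(
        (-1)**k / (math.pi * k) * math.sin(math.pi * k / 2) * math.cos(k * x)
        for k in range(1, n + 1)
    )

def h(x):
    """Assumed step function on [0, pi]: 0 up to pi/2, then 1."""
    return 0.0 if x <= math.pi / 2 else 1.0

def overshoot(n, samples=2000):
    """Largest value of S_n - 1 on a grid just to the right of the jump."""
    xs = [math.pi / 2 + 0.5 * i / samples for i in range(1, samples + 1)]
    return max(S(n, x) for x in xs) - 1.0

for n in (10, 40, 100):
    # Columns: n, value at a point away from the jump, value at the jump, overshoot.
    print(n, S(n, 1.0), S(n, math.pi / 2), overshoot(n))
```

Away from the jump the values approach $h(x)$, at the jump every partial sum equals $\frac{1}{2}$, and the overshoot past $1$ stays near the same size as $n$ grows instead of shrinking.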

The Gibbs phenomenon relates to yet a third definition of convergence. A sequence
of bounded functions $f_n$ is said to converge uniformly to a function $f$ on a set $W$ if
$$\lim_{n \to \infty} \sup_{x \in W} |f_n(x) - f(x)| = 0. \tag{8.15}$$
In the cases plotted in Fig. 8.3 we can see that
$$\sup_{x \in \mathbb{T}} |h(x) - S_n[h](x)| \approx \frac{1}{2}.$$
Since this does not decrease, uniform convergence fails for this series. However, the
sequence does converge uniformly on domains that exclude a neighborhood of the
jump point, for example on the interval $[0, \frac{\pi}{2} - \varepsilon]$ for $\varepsilon > 0$.

8.3 Pointwise Convergence


The basic theory of pointwise convergence of Fourier series was worked out by
Dirichlet in the mid-19th century. In this section we will establish a criterion for
pointwise convergence of periodic Fourier series.

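Theorem 8.3 Suppose $f \in L^2(\mathbb{T})$, and that for $x \in \mathbb{T}$ the estimate
$$\operatorname*{ess\text{-}sup}_{y \in [-\varepsilon, \varepsilon]} \left| \frac{f(x) - f(x - y)}{y} \right| < \infty, \tag{8.16}$$
holds for some $\varepsilon > 0$. Then
$$\lim_{n \to \infty} S_n[f](x) = f(x).$$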

The essential supremum was defined in (7.10). The inequality (8.16) means that,
after possibly replacing $f$ by an equivalent function in the sense of (7.6), we can
assume that
$$\sup_{0 < |y| < \varepsilon} \left| \frac{f(x) - f(x - y)}{y} \right| < \infty. \tag{8.17}$$
This bound holds automatically for $f \in C^1(\mathbb{T})$, by the estimate
$$\left| \frac{f(x) - f(x - y)}{y} \right| = \left| \frac{1}{y} \int_{x-y}^{x} f'(t) \, dt \right| \le \sup_{t \in \mathbb{T}} |f'(t)|.$$
Thus Theorem 8.3 shows that the Fourier series of a $C^1$ function converges pointwise
on all of $\mathbb{T}$. The same argument can be extended to functions on $\mathbb{T}$ which are merely
piecewise $C^1$.

It is possible to prove pointwise convergence with a weaker hypothesis than
that of Theorem 8.3. However, there are counterexamples, discovered by Fejér and
Lebesgue, that show that pointwise convergence of the Fourier series may fail for
$f \in C^0(\mathbb{T})$.

Before getting into the proof of Theorem 8.3, let us consider the structure of the
partial sums in more detail. Plugging the coefficient formula (8.12) into (8.13) gives
$$S_n[f](x) = \int_0^{2\pi} f(y) \left( \frac{1}{2\pi} \sum_{k=-n}^{n} e^{ik(x-y)} \right) dy. \tag{8.18}$$
The function that appears in parentheses is called the Dirichlet kernel,
$$D_n(t) := \frac{1}{2\pi} \sum_{k=-n}^{n} e^{ikt}. \tag{8.19}$$
With this definition the formula for the partial sum becomes
$$S_n[f](x) = \int_0^{2\pi} f(y) D_n(x - y) \, dy. \tag{8.20}$$
This could be written as a convolution,
$$S_n[f] = f * D_n.$$

Because the sum (8.19) is finite, it is clear that the Dirichlet kernel is a smooth
function on $\mathbb{T}$. It is also easy to compute that
$$\int_0^{2\pi} D_n(t) \, dt = 1 \tag{8.21}$$
for $n \in \mathbb{N}$, since only the $k = 0$ term in (8.19) contributes to the integral. Applying
the polynomial identity
$$1 + z + z^2 + \cdots + z^m = \frac{z^{m+1} - 1}{z - 1}$$
to (8.19) with $z = e^{it}$ gives the explicit formula
$$D_n(t) = \frac{1}{2\pi} \frac{e^{i(n+1)t} - e^{-int}}{e^{it} - 1}. \tag{8.22}$$
Factoring $e^{it/2}$ out of the numerator and denominator reduces this to
$$D_n(t) = \frac{1}{2\pi} \frac{\sin((n + \frac{1}{2})t)}{\sin(\frac{1}{2}t)},$$
which makes it clear that $D_n$ is real-valued.

A plot of $D_n(y)$ for various values of $n$, as shown in Fig. 8.4, gives some intuition
as to why we might expect (8.23) to converge to $f(x)$ as $n \to \infty$. The function
$D_n(y)$ concentrates at $y = 0$, and oscillates with increasing frequency away from
this point. These oscillations will cause cancellation as $n \to \infty$, except at $y = 0$.
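The two expressions for the Dirichlet kernel and the normalization (8.21) are easy to check numerically. The following sketch assumes the definition $D_n(t) = \frac{1}{2\pi}\sum_{k=-n}^n e^{ikt}$ and the closed form $\sin((n+\frac12)t)/(2\pi\sin(\frac12 t))$; the function names and the number of quadrature nodes are illustrative choices.

```python
import cmath
import math

def dirichlet_sum(n, t):
    """D_n(t) from the defining sum (8.19); the imaginary parts cancel."""
    total = sum(cmath.exp(1j * k * t) for k in range(-n, n + 1))
    return total.real / (2 * math.pi)

def dirichlet_closed(n, t):
    """Closed form of D_n(t), valid when t is not a multiple of 2*pi."""
    return math.sin((n + 0.5) * t) / (2 * math.pi * math.sin(t / 2))

def integral_over_period(n, m=512):
    """Trapezoid rule on [0, 2*pi]; for a periodic integrand the nodes are equally weighted."""
    step = 2 * math.pi / m
    return step * sum(dirichlet_sum(n, j * step) for j in range(m))

for n in (1, 5, 20):
    print(n, dirichlet_sum(n, 1.0), dirichlet_closed(n, 1.0), integral_over_period(n))
```

The peak value $D_n(0) = (2n+1)/(2\pi)$ grows with $n$ while the integral stays equal to $1$, which is the concentration at $y = 0$ described above.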

Proof of Theorem 8.3 Because both $f$ and $D_n$ are periodic, a change of variables
$y \to x - y$ allows us to rewrite the convolution in the opposite order:
$$S_n[f](x) = \int_{-\pi}^{\pi} D_n(y) f(x - y) \, dy. \tag{8.23}$$

Fig. 8.4 The Dirichlet kernel for increasing values of $n$

Thus, by (8.21) and (8.23),
$$f(x) - S_n[f](x) = \int_{-\pi}^{\pi} [f(x) - f(x - y)] D_n(y) \, dy.$$
Substituting the explicit formula (8.22) for $D_n$ gives
$$f(x) - S_n[f](x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{f(x) - f(x - y)}{e^{iy} - 1} \left[ e^{i(n+1)y} - e^{-iny} \right] dy. \tag{8.24}$$
The crucial observation here is that if we separate the terms inside the brackets, then
this looks like a formula for Fourier coefficients.

Assuming that the hypothesis of the theorem is satisfied at $x \in \mathbb{T}$, consider the
function
$$h(y) := \frac{f(x) - f(x - y)}{e^{iy} - 1}, \tag{8.25}$$
defined for $y \in \mathbb{T}$ with $y \ne 0$. We can split this into factors as
$$h(y) = \frac{f(x) - f(x - y)}{y} \cdot \frac{y}{e^{iy} - 1};$$
note that the first factor is essentially bounded near $y = 0$ by the assumption (8.16).
Since $e^{iy} - 1 \sim iy$ as $y \to 0$ by Taylor’s approximation, the second factor is also
bounded near $y = 0$. The hypothesis (8.16) thus guarantees that $h(y)$ is equivalent to
a function that is bounded on the interval $[-\varepsilon, \varepsilon]$. Since $f \in L^2(\mathbb{T})$ and $(e^{iy} - 1)^{-1}$
is bounded for $y \in \pm[\varepsilon, \pi]$, we conclude from this that $h \in L^2(\mathbb{T})$.

We can thus interpret (8.24) in terms of Fourier coefficients,
$$f(x) - S_n[f](x) = c_{-n-1}[h] - c_n[h]. \tag{8.26}$$
Bessel’s inequality, which takes the form (8.14) here, implies that $c_k[h] \to 0$ as
$k \to \pm\infty$. By (8.26) this establishes pointwise convergence at $x$. □


8.4 Uniform Convergence


According to the definition (8.15) of uniform convergence, $f_n \to f$ uniformly on $\mathbb{T}$
if
$$\sup_{x \in \mathbb{T}} |f_n(x) - f(x)| \to 0$$
as $n \to \infty$. This is closely related to convergence with respect to the $L^\infty$ norm, as
introduced in Sect. 7.3. If a sequence converges in the $L^\infty$ sense, then after possibly
modifying the functions on a set of measure zero we can assume that the convergence
is uniform.

Continuity is not necessarily preserved under pointwise limits. For example the
sequence $e^{-nx^2}$ converges pointwise on $\mathbb{R}$ but the limit function is discontinuous
at $x = 0$. On the other hand, uniform convergence of continuous functions does
guarantee continuity.
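A few sample values make the failure of uniform convergence concrete for $f_n(x) = e^{-nx^2}$. In the sketch below (illustrative names, not from the text), the pointwise limit is $1$ at $x = 0$ and $0$ elsewhere, while at the moving point $x_n = 1/\sqrt{n}$ the value $f_n(x_n) = e^{-1}$ never decays, so the supremum of the distance to the limit cannot tend to zero.

```python
import math

def f(n, x):
    """The sequence f_n(x) = exp(-n x^2)."""
    return math.exp(-n * x**2)

def limit(x):
    """Pointwise limit: 1 at the origin, 0 everywhere else."""
    return 1.0 if x == 0 else 0.0

for n in (1, 100, 10000):
    x_n = 1 / math.sqrt(n)
    # Columns: n, value at 0, value at a fixed x, gap to the limit at x_n.
    print(n, f(n, 0.0), f(n, 0.5), f(n, x_n) - limit(x_n))
```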

Lemma 8.4 Suppose $\{f_n\} \subset C^0(\Omega)$ for a domain $\Omega \subset \mathbb{R}^n$. If $\{f_n\}$ converges
uniformly to a function $f : \Omega \to \mathbb{R}$, then $f$ is also continuous.

Proof The goal is to show that $f(y)$ can be made close to $f(x)$ by taking $y$ close
to $x$. To make use of the uniform convergence, we note that the triangle inequality
implies
$$|f(x) - f(y)| \le |f(x) - f_n(x)| + |f_n(x) - f_n(y)| + |f_n(y) - f(y)|. \tag{8.27}$$
For $n$ large we can control the first and third terms on the right by the assumption of
uniform convergence. To control the middle term we can use the continuity of $f_n$.
Fix $x \in \Omega$ and $\varepsilon > 0$. By uniform convergence there exists $n$ so that
$$\sup_{y \in \Omega} |f_n(y) - f(y)| < \varepsilon. \tag{8.28}$$
The fact that $f_n$ is continuous at $x$ means that we can find $\delta > 0$ (depending on $x$)
such that for $y \in \Omega$ satisfying $|x - y| < \delta$,
$$|f_n(x) - f_n(y)| < \varepsilon. \tag{8.29}$$
Combining (8.28) and (8.29) with (8.27) shows that for $y \in \Omega$ satisfying
$|x - y| < \delta$,
$$|f(x) - f(y)| < 3\varepsilon.$$
Thus $f$ is continuous at $x$. □

Uniform convergence is particularly easy to check for periodic Fourier series,
because the eigenfunctions $\phi_k$ satisfy
$$|\phi_k(x)| = 1$$
for all $x \in \mathbb{T}$.

Theorem 8.5 For $f \in C^1(\mathbb{T})$, the sequence of partial sums $S_n[f]$ converges
uniformly to $f$.

Proof The assumption that $f \in C^1(\mathbb{T})$ implies $f' \in C^0(\mathbb{T})$. By integration by parts,
$$c_k[f'] = \frac{1}{2\pi} \int_{-\pi}^{\pi} f'(y) e^{-iky} \, dy
= \frac{1}{2\pi} f(y) e^{-iky} \Big|_{-\pi}^{\pi} + \frac{ik}{2\pi} \int_{-\pi}^{\pi} f(y) e^{-iky} \, dy.$$
The boundary term cancels by periodicity, leaving
$$c_k[f'] = ik \, c_k[f]. \tag{8.30}$$
Since $f' \in L^2(\mathbb{T})$ also, applying Bessel’s inequality in the form (8.14) to the
coefficients (8.30) implies that
$$\sum_{k \in \mathbb{Z}} |k \, c_k[f]|^2 < \infty. \tag{8.31}$$
Let $\ell^2(\mathbb{Z} \setminus \{0\})$ denote the discrete $L^2$ space consisting of functions
$\mathbb{Z} \setminus \{0\} \to \mathbb{C}$. (The $\ell^p$ spaces were introduced in Sect. 7.4.) The sequence
$$a_k := |k \, c_k[f]|$$
defines an element of $\ell^2(\mathbb{Z} \setminus \{0\})$ by (8.31). If we define $b \in \ell^2(\mathbb{Z} \setminus \{0\})$ by $b_k := |k|^{-1}$,
then the sum of the coefficients $c_k[f]$ with $k \ne 0$ can be expressed as an $\ell^2$ pairing,
$$\sum_{k \ne 0} |c_k[f]| = \langle a, b \rangle_{\ell^2}.$$
By the Cauchy-Schwarz inequality on $\ell^2(\mathbb{Z} \setminus \{0\})$,
$$\langle a, b \rangle_{\ell^2} \le \|a\|_{\ell^2} \|b\|_{\ell^2} < \infty.$$
Since the norms of $a$ and $b$ are finite, we conclude that
$$\sum_{k \in \mathbb{Z}} |c_k[f]| < \infty. \tag{8.32}$$
Note that we already know that $S_n[f] \to f$ pointwise by Theorem 8.3. This
implies that
$$f(x) - S_n[f](x) = \sum_{|k| > n} c_k[f] \phi_k(x)$$
for each $x \in \mathbb{T}$. Because $|\phi_k(x)| = 1$,
$$|S_n[f](x) - f(x)| \le \sum_{|k| > n} |c_k[f]|. \tag{8.33}$$
The right-hand side of (8.33) is independent of $x$ and tends to zero as $n \to \infty$ by
(8.32), proving that $S_n[f] \to f$ uniformly. □
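The two ingredients of this proof, the relation $c_k[f'] = ik\,c_k[f]$ and the tail bound on $|S_n[f](x) - f(x)|$, can be observed numerically. The sketch below uses a sample function chosen purely for illustration, $f(x) = e^{\cos x}$ (smooth and $2\pi$-periodic), and approximates the coefficient integrals with the trapezoid rule; the helper names, the node count and the cutoff `kmax` are assumptions of the example, not part of the text.

```python
import cmath
import math

def coeff(g, k, m=256):
    """Approximate c_k[g] = (1/2pi) * integral of g(x) e^{-ikx} over one period (trapezoid rule)."""
    step = 2 * math.pi / m
    return sum(g(j * step) * cmath.exp(-1j * k * j * step) for j in range(m)) / m

def f(x):
    """Sample smooth periodic function (an arbitrary choice)."""
    return math.exp(math.cos(x))

def df(x):
    """Derivative of the sample function."""
    return -math.sin(x) * math.exp(math.cos(x))

def partial_sum(n, x):
    """S_n[f](x) built from the numerically computed coefficients."""
    return sum(coeff(f, k) * cmath.exp(1j * k * x) for k in range(-n, n + 1)).real

def tail(n, kmax=40):
    """Sum of |c_k[f]| over n < |k| <= kmax, approximating the tail bound."""
    return sum(abs(coeff(f, k)) for k in range(-kmax, kmax + 1) if abs(k) > n)

for k in (1, 2, 3):
    print(k, coeff(df, k), 1j * k * coeff(f, k))

grid = [2 * math.pi * j / 50 for j in range(50)]
print(max(abs(partial_sum(5, x) - f(x)) for x in grid), tail(5))
```

The observed maximum error on the grid does not exceed the coefficient tail, independently of $x$, which is exactly the mechanism behind the uniform convergence.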


8.5  Convergence  in  L2 


The uniform convergence provided by Theorem 8.5 proves to be very helpful in
resolving the $L^2$ basis question. This is because uniform convergence on $\mathbb{T}$ implies
$L^2$ convergence also, by the integral estimate
$$\int_{-\pi}^{\pi} |f_n - f|^2 \, dx \le 2\pi \sup_{x \in \mathbb{T}} |f_n(x) - f(x)|^2.$$
Hence Theorem 8.5 gives convergence of Fourier series in $L^2$ for $C^1$ functions. In
this section we will extend this result to all of $L^2$.

Theorem 8.6 The normalized periodic Fourier eigenfunctions
$$\frac{1}{\sqrt{2\pi}} e^{ikx}, \qquad k \in \mathbb{Z},$$
form an orthonormal basis for $L^2(\mathbb{T})$.

Proof Suppose that $u \in L^2(\mathbb{T})$ satisfies
$$\langle u, \phi_k \rangle = 0 \tag{8.34}$$
for each $k \in \mathbb{Z}$. By Theorem 7.10 the conclusion will follow if we can deduce that
this implies $u = 0$.

As noted above, for $\psi \in C^1(\mathbb{T})$ Theorem 8.5 implies that $S_n[\psi] \to \psi$ in $L^2(\mathbb{T})$.
In particular, this gives
$$\lim_{n \to \infty} \langle u, S_n[\psi] \rangle = \langle u, \psi \rangle. \tag{8.35}$$
However, since $S_n[\psi]$ is a finite linear combination of the $\phi_k$, the assumption (8.34)
implies that
$$\langle u, S_n[\psi] \rangle = 0.$$
Hence from (8.35) we deduce that
$$\langle u, \psi \rangle = 0.$$

Now recall Theorem 7.5, which says that $C^\infty_{\mathrm{cpt}}(0, 2\pi)$ forms a dense subset of
$L^2(0, 2\pi)$. This implies also that $C^1(\mathbb{T})$ is dense in $L^2(\mathbb{T})$. Therefore we can choose
a sequence $\{\psi_l\}$ in $C^1(\mathbb{T})$ such that $\psi_l \to u$ in $L^2(\mathbb{T})$. Thus
$$\|u\|^2 = \lim_{l \to \infty} \langle u, \psi_l \rangle,$$
and we just showed that all terms on the right are zero under the assumption (8.34).
Therefore $u = 0$. □


The combination of Theorems 8.6 and 7.9 immediately yields the following:

Corollary 8.7 (Parseval’s identity) For $f \in L^2(\mathbb{T})$, the periodic Fourier coefficients
satisfy
$$\sum_{k \in \mathbb{Z}} |c_k[f]|^2 = \frac{1}{2\pi} \|f\|^2.$$

Applying Parseval’s identity to $f + g$, where $f, g \in L^2(\mathbb{T})$, and separating out
the cross-term yields the corresponding result for the inner product,
$$\langle f, g \rangle = 2\pi \sum_{k \in \mathbb{Z}} c_k[f] \overline{c_k[g]}. \tag{8.36}$$


Example 8.8 In Example 8.2, we found for the step function $h$ that $c_k[h] = \pm\frac{1}{\pi k}$ for
$k$ odd, $c_0[h] = \frac{1}{2}$, and otherwise $c_k[h] = 0$. So for this case,
$$\sum_{k \in \mathbb{Z}} |c_k[h]|^2 = \frac{1}{4} + 2 \sum_{k \in \mathbb{N}_{\mathrm{odd}}} \frac{1}{\pi^2 k^2}.$$
On the other hand, $\|h\|^2 = \pi$, so Parseval’s identity implies
$$\frac{1}{4} + \frac{2}{\pi^2} \sum_{k \in \mathbb{N}_{\mathrm{odd}}} \frac{1}{k^2} = \frac{1}{2}.$$
Thus we obtain the summation formula
$$\sum_{k \in \mathbb{N}_{\mathrm{odd}}} \frac{1}{k^2} = \frac{\pi^2}{8}. \tag{8.37}$$
◊

The space $L^2(0, 2\pi)$ can be identified with $L^2(\mathbb{T})$ by extending functions periodically.
Hence Theorem 8.6 also implies that $\{\frac{1}{\sqrt{2\pi}} e^{ikx}\}$ is an orthonormal basis for
$L^2(0, 2\pi)$. We can also specialize the periodic results to show that sine or cosine
series give orthonormal bases for $L^2(0, \pi)$ with basis functions that satisfy Dirichlet
or Neumann boundary conditions, respectively. We will discuss these cases in the
exercises.
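Returning to the summation formula of Example 8.8, the value $\pi^2/8$ for the sum of $1/k^2$ over odd $k$ can be confirmed by direct summation. This is only a numerical sanity check under the coefficient values quoted in that example; the function names and the truncation point are arbitrary.

```python
import math

def odd_reciprocal_squares(terms):
    """Partial sum of 1/k^2 over the first `terms` odd positive integers."""
    return sum(1 / (2 * j + 1)**2 for j in range(terms))

def parseval_left(terms):
    """Truncated left side of Parseval's identity for the step function: 1/4 + (2/pi^2) * sum."""
    return 0.25 + 2 / math.pi**2 * odd_reciprocal_squares(terms)

print(odd_reciprocal_squares(100000), math.pi**2 / 8)
print(parseval_left(100000))
```

The truncated sum approaches $\pi^2/8$ slowly, with an error of roughly $1/(4N)$ after $N$ terms, reflecting the slow $1/k$ decay of the coefficients of a discontinuous function.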


8.6  Regularity  and  Fourier  Coefficients 

In the preceding sections we have made some progress in understanding the representation
of a function by Fourier series. However, we still have not addressed one of
the primary questions raised in Sect. 8.1: when does a Fourier series yield a classical
solution to the original PDE? In this section we will resolve this issue by studying
the relationship between the regularity of a function and the decay of its Fourier
coefficients.

The starting point for this discussion is the computation used in the proof of
Theorem 8.5,
$$c_k[f'] = ik \, c_k[f]$$
for $f \in C^1(\mathbb{T})$. Repeating this computation inductively gives the following:

Lemma 8.9 Suppose that $f \in C^m(\mathbb{T})$. Then
$$c_k[f^{(m)}] = (ik)^m c_k[f]. \tag{8.38}$$

To describe the decay rates of coefficients, we introduce some convenient order
notation. For $\alpha \in \mathbb{R}$,
$$a_k = o(k^\alpha) \quad \text{means} \quad \lim_{|k| \to \infty} \frac{a_k}{|k|^\alpha} = 0.$$
This is commonly referred to as the “little-o” notion of order. There is a corresponding
“big-O” definition,
$$a_k = O(k^\alpha) \quad \text{means} \quad |a_k| \le C |k|^\alpha,$$
for all sufficiently large $|k|$, with $C$ independent of $k$. Note that the little-o condition
is stronger. The content of the statement $a_k = o(k^\alpha)$ is that the ratio $a_k / k^\alpha$ tends to
zero, while $a_k = O(k^\alpha)$ says only that the ratio is bounded.

Theorem 8.10 For $f \in C^m(\mathbb{T})$ with $m \in \mathbb{N}_0$,

\[
\sum_{k \in \mathbb{Z}} k^{2m} \bigl| c_k[f] \bigr|^2 < \infty, \tag{8.39}
\]

and

\[
c_k[f] = o(k^{-m}).
\]

Proof The inequality (8.39) follows immediately from a combination of Lemma 8.9 and Bessel’s inequality in the form (8.14). Since the terms in a convergent series must approach zero,

\[
\lim_{|k| \to \infty} k^m c_k[f] = 0,
\]

which gives the claimed decay estimate. □

Example 8.11 Consider the cosine series (8.6) computed in Example 8.1 for the function $h(x) = 3\pi x^2 - 2x^3$ on $(0, \pi)$. Although $h \in C^\infty(0, \pi)$, the extension of $h$ to $\mathbb{T}$ as an even function is merely $C^2$. Theorem 8.10 thus implies that $\sum k^4 |c_k[h]|^2 < \infty$.

The periodic Fourier coefficients corresponding to (8.9) are

\[
c_k[h] =
\begin{cases}
\dfrac{\pi^3}{2}, & k = 0, \\[1ex]
-\dfrac{24}{\pi k^4}, & k \text{ odd}, \\[1ex]
0, & k \ne 0, \text{ even}.
\end{cases}
\]

This shows much faster decay than predicted, but not the rapid decay we would have seen if the even periodic extension had been smooth.

Our next goal is to develop a converse to Theorem 8.10 that says that a certain level of decay rate of Fourier coefficients guarantees a corresponding level of differentiability for the function. In fact, the first stage of this result has already been worked out. Suppose $f \in L^2(\mathbb{T})$ and its coefficients satisfy

\[
\sum_{k \in \mathbb{Z}} \bigl| c_k[f] \bigr| < \infty. \tag{8.40}
\]


We know that $S_n[f] \to f$ in the $L^2$ sense by Theorem 8.6. By Theorem 8.5 we also know that $\{S_n[f]\}$ converges uniformly, and so the limit is continuous by Lemma 8.4. Hence we can conclude that (8.40) implies $f \in C^0(\mathbb{T})$. Recall from Sect. 7.3 that when we say an $L^2$ function is $C^m$ we mean this only up to equivalence, i.e., the original function might require modification on a set of measure zero to make it $C^m$.

Theorem 8.12 Suppose $f \in L^2(\mathbb{T})$ has Fourier coefficients satisfying

\[
\sum_{k \in \mathbb{Z}} \bigl| k^m c_k[f] \bigr| < \infty, \tag{8.41}
\]

for $m \in \mathbb{N}_0$. Then $f \in C^m(\mathbb{T})$.

Proof As remarked above, the $m = 0$ case is already taken care of by Theorems 8.5 and 8.6.

Assume that (8.41) is satisfied for $m = 1$. For convenience, let $c_k := c_k[f]$ and $f_n := S_n[f]$. Since $f_n$ is a (finite) linear combination of smooth functions, the derivatives are given by

\[
f_n'(x) = \sum_{k=-n}^{n} ik\, c_k e^{ikx}.
\]

By the $m = 0$ result, the sequence $\{f_n'\}$ converges uniformly to some $g \in C^0(\mathbb{T})$. Our goal is to show that $g = f'$, which means

\[
g(x) = \lim_{y \to 0} \frac{f(x + y) - f(x)}{y}
\]

for every $x \in \mathbb{T}$. To argue this we will decompose the difference quotient as

\[
\frac{f(x + y) - f(x)}{y} - g(x) = \left[ \frac{f_n(x + y) - f_n(x)}{y} - f_n'(x) \right] + \bigl( f_n'(x) - g(x) \bigr) + R_n(x, y). \tag{8.42}
\]


The first term on the right approaches zero as $y \to 0$ by the definition of $f_n'$, and the second term approaches zero as $n \to \infty$ by the construction of $g$. The remainder term is

\[
R_n(x, y) := \sum_{|k| > n} c_k \left( \frac{e^{ik(x+y)} - e^{ikx}}{y} \right),
\]

which converges absolutely for each $y \ne 0$ by (8.41).
We can estimate the remainder by

\[
|R_n(x, y)| \le \sum_{|k| > n} |c_k| \left| \frac{e^{iky} - 1}{y} \right|.
\]

By noting that $|e^{iky} - 1| = 2 |\sin(ky/2)|$, a simple calculus estimate gives

\[
\left| \frac{e^{iky} - 1}{y} \right| \le |k|
\]

for all $y \ne 0$. This implies a uniform estimate on the remainder term,

\[
|R_n(x, y)| \le \sum_{|k| > n} |k c_k|. \tag{8.43}
\]

In particular, by the assumption (8.41) the remainder term is arbitrarily small for $n$ large.
Fix $x \in \mathbb{T}$ and $\varepsilon > 0$. By (8.43) and the fact that $f_n'(x) \to g(x)$, we can pick $n$ so that

\[
|f_n'(x) - g(x)| < \varepsilon \quad \text{and} \quad |R_n(x, y)| < \varepsilon
\]

for all $y \ne 0$. For this $n$ and $x$, the definition of $f_n'(x)$ says that we can choose $\delta$ such that $0 < |y| < \delta$ implies

\[
\left| \frac{f_n(x + y) - f_n(x)}{y} - f_n'(x) \right| < \varepsilon.
\]

Applying these estimates to (8.42) shows that for $0 < |y| < \delta$,

\[
\left| \frac{f(x + y) - f(x)}{y} - g(x) \right| < 3\varepsilon.
\]

Since $\varepsilon$ was arbitrarily small, this shows that $f'(x) = g(x)$. And since $g$ is continuous, we conclude that $f \in C^1(\mathbb{T})$.

The same argument can now be repeated for higher derivatives, assuming (8.41) holds for larger $m$. □

The hypothesis (8.41) can be reformulated in terms of a decay condition on the coefficients, although this gives a slightly weaker result. If $c_k[f] = O(k^{-\alpha})$, then

\[
\sum_{k \ne 0} \bigl| k^m c_k[f] \bigr| \le C \sum_{k \ne 0} |k|^{-\alpha + m},
\]

and this series converges provided $\alpha > m + 1$. Hence Theorem 8.12 implies that $f \in C^m(\mathbb{T})$ under the condition that

\[
c_k[f] = O(k^{-m-1-\varepsilon})
\]

for some $\varepsilon > 0$.
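
The decay criterion can be checked numerically on the function of Example 8.11. The following is a minimal sketch, assuming the convention $c_k[f] = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x) e^{-ikx}\,dx$ and a simple midpoint rule; the helper names `fourier_coefficient` and `h_even` are hypothetical and do not come from the text. The computed coefficients decay like $k^{-4}$, which meets the condition $O(k^{-m-1-\varepsilon})$ with $m = 2$ and $\varepsilon = 1$, in agreement with the even extension being $C^2$.

```python
import cmath
import math


def fourier_coefficient(func, k, samples=4096):
    """Approximate c_k[f] = (1/2pi) * int_{-pi}^{pi} f(x) e^{-ikx} dx
    with the midpoint rule (illustrative helper, not from the text)."""
    step = 2 * math.pi / samples
    total = 0.0 + 0.0j
    for j in range(samples):
        x = -math.pi + (j + 0.5) * step
        total += func(x) * cmath.exp(-1j * k * x)
    return total * step / (2 * math.pi)


def h_even(x):
    """Even extension of h(x) = 3*pi*x^2 - 2*x^3 from (0, pi) to (-pi, pi)."""
    a = abs(x)
    return 3 * math.pi * a**2 - 2 * a**3


if __name__ == "__main__":
    for k in (1, 3, 5, 7):
        numeric = fourier_coefficient(h_even, k).real
        closed_form = -24 / (math.pi * k**4)
        # The product k^4 * c_k stays bounded, as O(k^{-4}) decay requires.
        print(k, numeric, closed_form, k**4 * numeric)
```

The ratio printed in the last column stays near $-24/\pi$, so the coefficients are $O(k^{-4})$ but no better, which is consistent with the third derivative of the even extension having a jump.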

Let  us  finally  return  to  the  one-dimensional  heat  equation  that  motivated  this 
discussion,  first  considering  the  periodic  case. 

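Theorem 8.13 For $h \in C^0(\mathbb{T})$, the heat equation on $[0, \infty) \times \mathbb{T}$,

\[
\frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial x^2} = 0,
\]

admits a solution $u \in C^\infty((0, \infty) \times \mathbb{T})$, defined for $t > 0$ by

\[
u(t, x) := \sum_{k \in \mathbb{Z}} c_k[h]\, e^{-k^2 t} e^{ikx}, \tag{8.44}
\]

and satisfying

\[
\lim_{t \to 0} u(t, x) = h(x) \tag{8.45}
\]

for each $x \in \mathbb{T}$.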

Proof For $t > 0$, the Fourier coefficients of $u(t, \cdot)$ decay exponentially, and Theorem 8.12 shows that $u(t, \cdot) \in C^\infty(\mathbb{T})$ for each $t$. The same arguments used in the proof of that theorem apply to the $t$ derivatives. To see this, let $u_n$ denote the partial sum of (8.44),

\[
u_n(t, x) := \sum_{k=-n}^{n} c_k e^{-k^2 t} e^{ikx},
\]

where $c_k := c_k[h]$. As a finite sum, this can be differentiated directly,

\[
\frac{\partial u_n}{\partial t}(t, x) = \sum_{k=-n}^{n} (-k^2)\, c_k e^{-k^2 t} e^{ikx}. \tag{8.46}
\]


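By Theorem 8.10 the Fourier coefficients of $h$ satisfy $c_k = o(k^0)$, so that

\[
\bigl| (-k^2)\, c_k e^{-k^2 t} e^{ikx} \bigr| \le C k^2 e^{-k^2 t}. \tag{8.47}
\]

As $n \to \infty$ the series (8.46) thus converges absolutely for $t > 0$, allowing us to define a function

\[
g(t, x) := \lim_{n \to \infty} \frac{\partial u_n}{\partial t}(t, x).
\]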

For $\varepsilon > 0$, the estimate (8.47) shows that the convergence is uniform for $t \ge \varepsilon$. Lemma 8.4 shows that the limit is continuous for $t \ge \varepsilon$. Since $\varepsilon > 0$ is arbitrary, this implies $g \in C^0((0, \infty) \times \mathbb{T})$.

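We can argue that $g = \partial u/\partial t$ by considering

\[
\frac{u(t + s, x) - u(t, x)}{s} - g(t, x) = \left[ \frac{u_n(t + s, x) - u_n(t, x)}{s} - \frac{\partial u_n}{\partial t}(t, x) \right] + \left( \frac{\partial u_n}{\partial t}(t, x) - g(t, x) \right) + R_n(t, s, x),
\]

where

\[
R_n(t, s, x) := \sum_{|k| > n} c_k e^{-k^2 t} e^{ikx} \left( \frac{e^{-k^2 s} - 1}{s} \right).
\]

At this point the argument becomes essentially parallel to the analysis of (8.42), so we will omit the details. The conclusion is that $\partial u/\partial t$ is continuous on $(0, \infty) \times \mathbb{T}$.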

The argument can be repeated for higher $t$ derivatives, allowing us to conclude that $u \in C^\infty((0, \infty) \times \mathbb{T})$. Moreover, the partial derivatives of $u_n$ converge to the corresponding derivatives of $u$, pointwise on $(0, \infty) \times \mathbb{T}$, and uniformly if we restrict to $t \ge \varepsilon$ for some $\varepsilon > 0$. Since $u_n$ satisfies the heat equation for each $n$ by construction, this shows that $u$ satisfies the heat equation also.

At the moment we only have the tools to prove (8.45) under the stronger assumption that $h \in C^1(\mathbb{T})$. In this case we can argue exactly as in the proof of Theorem 8.5 that (8.44) converges uniformly for $(t, x) \in [0, \infty) \times \mathbb{T}$. By Lemma 8.4 this shows $u \in C^0([0, \infty) \times \mathbb{T})$ and we can just set $t = 0$ to obtain (8.45).

If $h$ is merely continuous, then this approach breaks down, because the series (8.44) may actually diverge for $t = 0$. We will cover the $C^0$ case in Chap. 13, after developing an alternate formula for (8.44). □

The one-dimensional heat equation derived in Sect. 6.1 involved Dirichlet or Neumann boundary conditions on an interval $[0, \ell]$. By rescaling the interval to $[0, \pi]$ and then extending functions to $\mathbb{T}$ with either even or odd symmetry, we can apply the results for periodic Fourier series to these cases. In particular, from Theorem 8.13 we deduce the following:

Corollary 8.14 Suppose $h \in C^0[0, \ell]$ satisfies Dirichlet or Neumann boundary conditions. The heat equation on $[0, \infty) \times [0, \ell]$ admits a solution $u \in C^\infty((0, \infty) \times [0, \ell])$, under the same boundary condition, such that

\[
\lim_{t \to 0} u(t, x) = h(x)
\]

for each $x \in [0, \ell]$.

The solutions obtained in Theorem 8.13 and Corollary 8.14 are uniquely determined by the initial condition $h$. (See Exercise 6.4.) The fact that solutions are smooth for $t > 0$, even when $h$ is merely continuous, is a characteristic property of diffusion equations. In fact the smoothing phenomenon carries over to cases where $h$ is $L^2$ but not even continuous. This is illustrated in Fig. 8.5 for the case considered in Example 8.2.
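
The smoothing effect can be seen directly in the series (8.44). The sketch below assumes, purely for illustration, the step initial condition $h = 1$ on $(0, \pi)$ and $h = 0$ on $(\pi, 2\pi)$ (this need not be the function of Example 8.2). Its coefficients are $c_0 = 1/2$ and $c_k = 1/(\pi i k)$ for odd $k$, so (8.44) reduces to a sine series with factors $e^{-k^2 t}$. The function name `heat_solution` is hypothetical.

```python
import math


def heat_solution(t, x, n_terms=200):
    """Partial sum of (8.44) for the assumed step initial data
    h = 1 on (0, pi), h = 0 on (pi, 2*pi):
    u(t, x) = 1/2 + sum over odd k of (2/(pi*k)) e^{-k^2 t} sin(kx)."""
    total = 0.5
    for k in range(1, n_terms + 1, 2):  # only odd k contribute
        total += (2 / (math.pi * k)) * math.exp(-k * k * t) * math.sin(k * x)
    return total


if __name__ == "__main__":
    # For t > 0 the factor e^{-k^2 t} kills high frequencies, so a handful
    # of terms already determines the solution to machine precision.
    for t in (0.01, 0.1, 1.0):
        few = heat_solution(t, 1.0, n_terms=41)
        many = heat_solution(t, 1.0, n_terms=401)
        print(t, many, abs(many - few))
```

At $t = 0$ the same series converges only slowly (coefficients of size $1/k$), whereas for any fixed $t > 0$ the exponential factors give the rapid coefficient decay that Theorem 8.12 converts into smoothness.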

As  these  applications  show,  the  Fourier  series  approach  (and  spectral  analysis 
in  general)  is  well  suited  to  analyzing  the  regularity  of  solutions.  However,  other 
qualitative  features  are  perhaps  obscured  from  this  viewpoint.  For  example,  we  would 
expect  solutions  to  reflect  the  physical  principle  that  heat  flows  from  hot  to  cold.  We 
can  see  this  behavior  quite  clearly  in  the  plots  of  Fig.  8.5,  but  it  is  not  at  all  apparent 
in  the  series  formula  (8.44). 


Fig.  8.5  Solutions  of  the  heat  equation  become  smooth  for  t  >  0 


8.7  Exercises 


8.1 For $x \in (0, \pi)$, let

\[
f(x) = x.
\]

(a) Extend $f$ to an odd function on $\mathbb{T}$ and compute the periodic Fourier coefficients $c_k[f]$ according to (8.12). (Note that the case $k = 0$ needs to be treated separately.) Show that the periodic series reduces to a sine series in this case.

(b) Show that the convergence of the Fourier series at $x = \frac{\pi}{2}$, which is guaranteed by Theorem 8.3, yields the summation formula

\[
\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots.
\]

(c) Show the Parseval identity (Corollary 8.7) leads to the formula

\[
\sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6}.
\]


8.2 For $x \in (0, \pi)$, let

\[
g(x) = x.
\]

(a) Extend $g$ to an even function on $\mathbb{T}$ and compute the periodic Fourier coefficients $c_k[g]$ according to (8.12). (Note that the case $k = 0$ needs to be treated separately.) Show that the periodic series reduces to a cosine series in this case.

(b) Show that the convergence of the Fourier series at $x = 0$, which is guaranteed by Theorem 8.3, reproduces the formula (8.37).

(c) Show the Parseval identity (Corollary 8.7) implies the formula

\[
\sum_{k \in \mathbb{N}_{\mathrm{odd}}} \frac{1}{k^4} = \frac{\pi^4}{96}.
\]


8.3 Consider the periodic wave equation

\[
\frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} = 0
\]

for $t \in \mathbb{R}$ and $x \in \mathbb{T}$. Suppose the initial conditions are

\[
u(0, x) = g(x), \qquad \frac{\partial u}{\partial t}(0, x) = h(x),
\]

for $g \in C^{m+1}(\mathbb{T})$ and $h \in C^m(\mathbb{T})$, for $m \in \mathbb{N}$.

(a) Assuming that $u(t, x)$ can be represented as a Fourier series

\[
u(t, x) = \sum_{k \in \mathbb{Z}} a_k(t) e^{ikx}, \tag{8.48}
\]

find an expression for $a_k(t)$ in terms of the Fourier coefficients of $g$ and $h$.

(b) Using the assumptions on $g$ and $h$, together with Theorem 8.10, show that the coefficients $a_k(t)$ satisfy an estimate

\[
\sum_{k \in \mathbb{Z}} k^{2m} |a_k(t)|^2 \le M < \infty,
\]

uniformly for $t \in \mathbb{R}$.

(c) By the arguments used in Theorem 8.13, (b) implies that the series (8.48) converges to a solution $u$ satisfying the initial conditions. What could you conclude about the differentiability of $u$?

8.4 In $L^2(0, \pi)$ consider the sequence

\[
\psi_k(x) := \sqrt{\frac{2}{\pi}}\, \sin kx,
\]

for $k \in \mathbb{N}$.

(a) Show that $\{\psi_k\}$ is an orthonormal sequence.

(b) Suppose that $f \in L^2(0, \pi)$ and $\langle f, \psi_k \rangle = 0$ for all $k \in \mathbb{N}$. Show that $f = 0$. (Hint: extend $f$ to an odd $2\pi$-periodic function on $\mathbb{R}$, which can be regarded as an element of $L^2(\mathbb{T})$. Then apply Theorem 8.6.)

(c) Conclude that $\{\psi_k\}$ is an orthonormal basis for $L^2(0, \pi)$.

8.5 Suppose that $f \in L^2(-\pi, \pi)$ satisfies

\[
\int_{-\pi}^{\pi} x^l f(x)\, dx = 0
\]

for all $l \in \mathbb{N}_0$.

(a) Show that $\langle q_{m,k}, f \rangle = 0$ for all $m \in \mathbb{N}$ and $k \in \mathbb{Z}$, where

\[
q_{m,k}(x) := \sum_{l=0}^{m} \frac{(-ikx)^l}{l!}.
\]

(b) Note that

\[
\lim_{m \to \infty} q_{m,k}(x) = e^{-ikx}
\]

by the definition of the complex exponential. Show that this convergence is uniform for $x \in [-\pi, \pi]$ (with $k$ fixed).

(c) Use (a) and (b) to show that for $f \in L^2(-\pi, \pi)$,

\[
\int_{-\pi}^{\pi} e^{-ikx} f(x)\, dx = 0
\]

for all $k \in \mathbb{Z}$.

(d) Conclude from Theorem 8.6 that $f = 0$. (In other words, the monomials $1, x, x^2, \dots$ form a basis for $L^2(-\pi, \pi)$, although not an orthonormal one.)

8.6 The Legendre polynomials are functions of $z \in [-1, 1]$ defined by

\[
P_k(z) := \frac{1}{2^k k!} \frac{d^k}{dz^k} (z^2 - 1)^k
\]

for $k \in \mathbb{N}_0$. (This corresponds to the case $m = 0$ in (5.31).) These are solutions of the eigenvalue equation

\[
L P_k = k(k + 1) P_k,
\]

where

\[
L := -\frac{d}{dz} \left[ (1 - z^2) \frac{d}{dz} \right].
\]

(a) For $u, v \in C^2[-1, 1]$, check that $L$ satisfies a formal self-adjointness condition,

\[
\langle u, Lv \rangle_{L^2} = \langle Lu, v \rangle_{L^2}.
\]

Conclude that the $P_k$'s with distinct values of $k$ are orthogonal in $L^2(-1, 1)$.

(b) Use the result of Exercise 8.5 to show that $\{P_k\}$ forms an orthogonal basis for $L^2(-1, 1)$. (The $P_k$ are normalized by the condition $P_k(1) = 1$, rather than by unit $L^2$ norm.)


Chapter  9 

Maximum  Principles 


We  saw  in  Sect.  4.7  that  conservation  of  energy  can  be  used  to  derive  uniqueness  for 
solutions  of  the  wave  equation.  In  this  chapter  we  will  consider  another  approach  to 
issues  of  uniqueness  and  stability,  based  on  maximum  values.  This  method  applies 
generally  to  elliptic  equations,  which  describe  equilibrium  states,  and  to  parabolic 
equations,  which  are  generally  used  to  model  diffusion. 


9.1  Model  Problem:  The  Laplace  Equation 

As noted in Sect. 5.2, the classical evolution equations such as the heat or wave equation have the form

\[
P_t u - \Delta u = 0,
\]

where $P_t$ denotes some combination of time derivatives. In an equilibrium state, for which the solution is independent of time, these equations all reduce to

\[
\Delta u = 0,
\]

which is called the Laplace equation. A solution of the Laplace equation is also called a harmonic function. The Laplace equation on a bounded domain $\Omega$ is generally formulated with an inhomogeneous Dirichlet boundary condition,

\[
u|_{\partial\Omega} = f
\]

for $f : \partial\Omega \to \mathbb{R}$.
The Laplace equation frequently appears in applications involving vector fields. A conservative vector field $v \in C^0(\Omega; \mathbb{R}^n)$ can be represented as the gradient of a potential function $\phi \in C^1(\Omega; \mathbb{R})$,

\[
v = \nabla\phi.
\]

If the vector field $v$ is also solenoidal ($\nabla \cdot v = 0$), then the potential satisfies the Laplace equation

\[
\Delta\phi = 0.
\]

In fluid dynamics in $\mathbb{R}^3$, for example, the velocity field is solenoidal for an incompressible fluid, such as water, and conservative precisely when the flow is irrotational ($\nabla \times v = 0$).

Electrostatics provides another important source of Laplace problems. In the absence of charges, the electric field $E$ is conservative and is commonly written as

\[
E = -\nabla\phi,
\]

where $\phi$ is the electric potential. On the other hand, Gauss’s law of electrostatics says that $\nabla \cdot E$ is proportional to the electric charge density. Hence, the electric potential for a charge-free region satisfies the Laplace equation.

In  the  remainder  of  this  section  we  will  consider  a  particular  classical  case,  the 
Laplace  problem  on  the  unit  disk.  Circular  symmetry  allows  us  to  solve  the  equation 
explicitly  using  Fourier  series,  and  the  resulting  formula  gives  some  insight  into  the 
general  behavior  of  harmonic  functions. 

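Let $\mathbb{D}$ denote the open unit disk in $\mathbb{R}^2$. Given $g \in C^0(\partial\mathbb{D})$, our goal is to solve

\[
\Delta u = 0, \qquad u|_{\partial\mathbb{D}} = g. \tag{9.1}
\]

In Sect. 5.3 we used separation of variables in polar coordinates to find the family of harmonic functions,

\[
\phi_k(r, \theta) := r^{|k|} e^{ik\theta},
\]

for $k \in \mathbb{Z}$. The boundary $\partial\mathbb{D}$ is naturally identified with the space $\mathbb{T} := \mathbb{R}/2\pi\mathbb{Z}$ introduced in Sect. 8.2, parametrized by $\theta$.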

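Consider the periodic Fourier series expansion,

\[
g(\theta) = \sum_{k \in \mathbb{Z}} c_k[g]\, e^{ik\theta}, \tag{9.2}
\]

where

\[
c_k[g] := \frac{1}{2\pi} \int_0^{2\pi} e^{-ik\theta} g(\theta)\, d\theta.
\]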
Given that

\[
e^{ik\theta} = \phi_k(1, \theta),
\]

we might hope to construct a solution of (9.1) by setting

\[
u(r, \theta) = \sum_{k \in \mathbb{Z}} c_k[g]\, \phi_k(r, \theta). \tag{9.3}
\]

Theorem 8.10 shows that the sequence $\{c_k[g]\}$ is bounded for $g$ continuous. Note also that

\[
|\phi_k(r, \theta)| = r^{|k|}
\]

and

\[
\sum_{k \in \mathbb{Z}} r^{|k|} < \infty
\]

for $r < 1$ by geometric series. This implies that (9.3) converges absolutely for $r < 1$. In fact the convergence is uniform on $\{r \le R\}$ for $R < 1$.

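We can write $u(r, \theta)$ more explicitly by substituting the definition of $c_k[g]$ into the series (9.3),

\[
u(r, \theta) = \frac{1}{2\pi} \sum_{k \in \mathbb{Z}} \int_0^{2\pi} r^{|k|} e^{ik(\theta - \eta)} g(\eta)\, d\eta.
\]

Uniform convergence in $\theta$ for $r < 1$ allows us to move the sum inside the integral, yielding the formula

\[
u(r, \theta) = \frac{1}{2\pi} \int_0^{2\pi} P_r(\theta - \eta)\, g(\eta)\, d\eta, \tag{9.4}
\]

where

\[
P_r(\theta) := \sum_{k \in \mathbb{Z}} r^{|k|} e^{ik\theta}. \tag{9.5}
\]

This function is called the Poisson kernel. Its behavior as $r \to 1$ is illustrated in Fig. 9.1.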

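Summing by geometric series gives the formula

\[
\begin{aligned}
P_r(\theta) &= 1 + \sum_{k=1}^{\infty} (r e^{i\theta})^k + \sum_{k=1}^{\infty} (r e^{-i\theta})^k \\
&= 1 + \frac{r e^{i\theta}}{1 - r e^{i\theta}} + \frac{r e^{-i\theta}}{1 - r e^{-i\theta}} \\
&= \frac{1 - r^2}{1 - 2r\cos\theta + r^2}.
\end{aligned} \tag{9.6}
\]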

Fig. 9.1 The Poisson kernel $P_r(\theta)$ for a succession of radii


From the series formula (9.5) we can also deduce directly that

\[
\frac{1}{2\pi} \int_0^{2\pi} P_r(\theta)\, d\theta = 1, \tag{9.7}
\]

since the only nonzero contribution comes from the term $k = 0$.
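
The identities (9.6) and (9.7) are easy to confirm numerically. The following sketch compares a truncation of the series (9.5) with the closed form and averages the kernel over the circle; the function names and the truncation and sampling parameters are illustrative choices, not part of the text.

```python
import math


def poisson_series(r, theta, n_terms=400):
    """Truncation of (9.5): the terms k and -k combine into 2 r^k cos(k theta)."""
    return 1.0 + 2.0 * sum(r**k * math.cos(k * theta) for k in range(1, n_terms + 1))


def poisson_closed(r, theta):
    """Closed form (9.6) of the Poisson kernel."""
    return (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(theta) + r * r)


def mean_over_circle(func, samples=2000):
    """Midpoint-rule average of func over [0, 2*pi]."""
    step = 2 * math.pi / samples
    return sum(func((j + 0.5) * step) for j in range(samples)) / samples


if __name__ == "__main__":
    for r in (0.2, 0.5, 0.9):
        gap = abs(poisson_series(r, 1.0) - poisson_closed(r, 1.0))
        mean = mean_over_circle(lambda th: poisson_closed(r, th))
        # The peak value P_r(0) = (1 + r)/(1 - r) grows as r -> 1,
        # while the average over the circle stays equal to 1.
        print(r, gap, mean, poisson_closed(r, 0.0))
```

The growing peak at $\theta = 0$ together with the fixed average is the concentration of the kernel that Fig. 9.1 displays.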

By periodicity, a change of variables $\eta \to \theta - \eta$ in (9.4) gives the alternate form

\[
u(r, \theta) = \frac{1}{2\pi} \int_0^{2\pi} P_r(\eta)\, g(\theta - \eta)\, d\eta. \tag{9.8}
\]

In view of (9.7), this could be interpreted as a weighted average of $g$ with a weight function that depends on $r$. As $r \to 1^-$ this weight function becomes concentrated at $0$, as Fig. 9.1 demonstrates. This is the mechanism by which we expect to have $u(r, \theta) \to g(\theta)$ as $r \to 1^-$.

Theorem 9.1 For $g \in C^0(\partial\mathbb{D})$, the Laplace equation,

\[
\Delta u = 0 \ \text{in } \mathbb{D}, \qquad u|_{\partial\mathbb{D}} = g,
\]

admits a classical solution $u \in C^\infty(\mathbb{D}) \cap C^0(\overline{\mathbb{D}})$ given by the Poisson integral (9.4).

Proof The function $P_r(\theta)$ is smooth for $r < 1$, and it follows from (9.5) that

\[
\Delta P_r(\theta) = 0,
\]

where $P_r(\theta)$ is interpreted as a function on $\mathbb{D}$ written in polar coordinates. By passing derivatives inside the integral, we can deduce from (9.4) that $u \in C^\infty(\mathbb{D})$ and

\[
\Delta u = 0.
\]


To complete the proof we need to check that

\[
\lim_{r \to 1^-} u(r, \theta) = g(\theta) \tag{9.9}
\]

for every $\theta \in \partial\mathbb{D}$, which will also show that $u \in C^0(\overline{\mathbb{D}})$. Note that (9.9) is not the same as claiming that the Fourier series for $g$ converges, which is not necessarily true. The difference lies in the order of the limits. In (9.9) we take the limit of the Fourier series first for $r < 1$, and then the limit $r \to 1^-$. This limit exists, as we will see below, but if we first set $r = 1$ in (9.3) then the sum over $k$ may diverge.

By (9.7) and (9.8) we can write

\[
u(r, \theta) - g(\theta) = \frac{1}{2\pi} \int_{-\pi}^{\pi} P_r(\eta) \bigl[ g(\theta - \eta) - g(\theta) \bigr]\, d\eta. \tag{9.10}
\]

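The goal is to estimate the left-hand side for $r$ close to 1. Fix $\theta \in \partial\mathbb{D}$ and let $\varepsilon > 0$. Since $g$ is continuous, there exists $\delta > 0$ so that

\[
|g(\theta - \eta) - g(\theta)| < \varepsilon \tag{9.11}
\]

for $|\eta| < \delta$. For $|\eta| \ge \delta$ we can estimate

\[
\max_{\delta \le |\eta| \le \pi} P_r(\eta) = P_r(\delta). \tag{9.12}
\]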

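Thus, splitting the integral (9.10) at $|\eta| = \delta$ gives

\[
\begin{aligned}
|u(r, \theta) - g(\theta)| &\le \frac{1}{2\pi} \int_{-\delta}^{\delta} P_r(\eta) \bigl| g(\theta - \eta) - g(\theta) \bigr|\, d\eta + \frac{1}{2\pi} \int_{\delta \le |\eta| \le \pi} P_r(\eta) \bigl| g(\theta - \eta) - g(\theta) \bigr|\, d\eta \\
&\le \frac{\varepsilon}{2\pi} \int_{-\delta}^{\delta} P_r(\eta)\, d\eta + \frac{P_r(\delta)}{2\pi} \int_{\delta \le |\eta| \le \pi} \bigl| g(\theta - \eta) - g(\theta) \bigr|\, d\eta.
\end{aligned}
\]

By (9.7) and the fact that $P_r > 0$,

\[
\frac{1}{2\pi} \int_{-\delta}^{\delta} P_r(\eta)\, d\eta \le 1.
\]

Furthermore, since $g$ is continuous, $|g(\theta - \eta) - g(\theta)|$ is bounded by some constant $M$ for all $\theta$ and $\eta$. This reduces the bound to

\[
|u(r, \theta) - g(\theta)| \le \varepsilon + M P_r(\delta).
\]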
We can now use the fact that

\[
\lim_{r \to 1^-} P_r(\delta) = 0
\]

to choose $R < 1$ so that

\[
M P_r(\delta) < \varepsilon
\]

for $R < r < 1$. We conclude that

\[
|u(r, \theta) - g(\theta)| < 2\varepsilon
\]

for $R < r < 1$. Since $\varepsilon$ was arbitrary, this shows

\[
\lim_{r \to 1^-} |u(r, \theta) - g(\theta)| = 0.
\]

□

For  students  who  know  some  complex  analysis,  we  note  that  the  formula  (9.4) 
could  be  deduced  from  the  Cauchy  integral  formula,  because  any  harmonic  function 
on  D  is  the  real  part  of  a  holomorphic  function. 

Example 9.2 For $0 < a < \pi$, suppose the boundary function is given by

\[
g(\theta) =
\begin{cases}
1 - \dfrac{|\theta|}{a}, & |\theta| \le a, \\[1ex]
0, & a < |\theta| \le \pi,
\end{cases}
\]

as shown in Fig. 9.2.

This boundary condition could represent, for example, a hot spot at one point on the edge of a metal plate. The corresponding equilibrium temperature distribution within the plate is given by calculating the Fourier coefficients of $g$ and substituting into (9.3). The resulting solution,

\[
u(r, \theta) = \frac{a}{2\pi} + \frac{2}{\pi a} \sum_{k=1}^{\infty} \frac{1 - \cos(ka)}{k^2}\, r^k \cos(k\theta),
\]

is illustrated in Fig. 9.3.
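
The series of Example 9.2 can be evaluated directly. The sketch below assumes the triangular boundary function with peak height 1 written above and truncates the sum at a fixed number of terms; `triangle_boundary` and `disk_solution` are hypothetical helper names. It checks that the value at the center is the boundary average $a/(2\pi)$ and that the series approaches the boundary data as $r \to 1$.

```python
import math


def triangle_boundary(theta, a):
    """Boundary data of Example 9.2: a triangular peak of width 2a and height 1."""
    theta = (theta + math.pi) % (2 * math.pi) - math.pi  # reduce to [-pi, pi)
    return max(0.0, 1.0 - abs(theta) / a)


def disk_solution(r, theta, a, n_terms=2000):
    """Truncated series for the harmonic function with the triangular boundary data."""
    total = a / (2 * math.pi)
    for k in range(1, n_terms + 1):
        coeff = (2 / (math.pi * a)) * (1 - math.cos(k * a)) / k**2
        total += coeff * r**k * math.cos(k * theta)
    return total


if __name__ == "__main__":
    a = 1.0
    print("center value:", disk_solution(0.0, 0.0, a), "boundary mean:", a / (2 * math.pi))
    for r in (0.5, 0.9, 0.99, 1.0):
        print(r, disk_solution(r, 0.0, a), disk_solution(r, 2.0, a))
    print("boundary data:", triangle_boundary(0.0, a), triangle_boundary(2.0, a))
```

Because the coefficients decay like $k^{-2}$, the series still converges at $r = 1$ in this example, although only slowly; for $r < 1$ the factor $r^k$ makes the truncation error negligible.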


Fig. 9.2 Boundary function with a triangular peak

Fig. 9.3 Contour plot of the harmonic function from Example 9.2

9.2  Mean  Value  Formula 

Setting $r = 0$ in the Poisson formula (9.4) gives

\[
u(0) = \frac{1}{2\pi} \int_0^{2\pi} g(\theta)\, d\theta, \tag{9.13}
\]

because $P_0(\theta) = 1$. In other words, the value of a harmonic function at the center of the disk is equal to its average value on the boundary. This phenomenon is illustrated in Fig. 9.4. In this section we will extend (9.13) to an averaging formula that works in any dimension.

The $(n - 1)$-dimensional volume of a sphere of radius $r$ is

\[
\operatorname{vol}[\partial B(x_0; r)] = A_n r^{n-1}, \tag{9.14}
\]

where $A_n$ denotes the volume of the unit sphere in $\mathbb{R}^n$, as defined in (2.13). It follows from the radial integral formula (2.10) that

\[
\operatorname{vol}[B(x_0; r)] = \frac{A_n r^n}{n}. \tag{9.15}
\]

Fig. 9.4 Mean value property of a harmonic function

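To state the mean value formula for a ball of radius $R$, we introduce the family of radial functions,

\[
G_R(x) :=
\begin{cases}
\dfrac{1}{2\pi} \log\left( \dfrac{r}{R} \right), & n = 2, \\[2ex]
\dfrac{1}{(n - 2) A_n} \left[ \dfrac{1}{R^{n-2}} - \dfrac{1}{r^{n-2}} \right], & n \ge 3,
\end{cases} \tag{9.16}
\]

where $r := |x|$. The function $G_R$ is the unique solution of the equations

\[
\frac{\partial G_R}{\partial r} = \frac{1}{A_n r^{n-1}}, \qquad G_R|_{r=R} = 0. \tag{9.17}
\]

Note that $G_R$ is integrable on $B(0; R)$, despite the singularity at the origin, because the radial volume element is $A_n r^{n-1}\, dr$ by (2.10).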

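Theorem 9.3 (Mean value formula) Assume that $u \in C^2(\Omega)$ on a domain $\Omega \subset \mathbb{R}^n$ with $n \ge 2$. For $R > 0$ such that $B(x_0; R) \subset \Omega$,

\[
u(x_0) = \frac{1}{A_n R^{n-1}} \int_{\partial B(x_0; R)} u(x)\, dS + \int_{B(x_0; R)} G_R(x - x_0)\, \Delta u(x)\, d^n x.
\]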

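Proof By a change of variables, it suffices to consider the case $x_0 = 0$. The formula (2.15) for the radial component of the Laplacian implies that

\[
\Delta G_R(x) = 0
\]

for $x \ne 0$. For $\varepsilon > 0$, we can therefore apply Green’s second identity (Theorem 2.11) on the domain $\{\varepsilon < r < R\}$ to obtain

\[
\int_{\{\varepsilon < r < R\}} G_R \Delta u\, d^n x = \int_{\{r = R\}} \left( G_R \frac{\partial u}{\partial r} - u \frac{\partial G_R}{\partial r} \right) dS - \int_{\{r = \varepsilon\}} \left( G_R \frac{\partial u}{\partial r} - u \frac{\partial G_R}{\partial r} \right) dS. \tag{9.18}
\]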


Because $G_R$ is integrable on $B(0; R)$, on the left-hand side of (9.18) we can take $\varepsilon \to 0$ to obtain

\[
\lim_{\varepsilon \to 0} \int_{\{\varepsilon < r < R\}} G_R \Delta u\, d^n x = \int_{B(0; R)} G_R \Delta u\, d^n x. \tag{9.19}
\]


By (9.17), the first term on the right in (9.18) reduces to

\[
\int_{\{r = R\}} \left( G_R \frac{\partial u}{\partial r} - u \frac{\partial G_R}{\partial r} \right) dS = -\frac{1}{A_n R^{n-1}} \int_{\{r = R\}} u\, dS. \tag{9.20}
\]

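The second term on the right in (9.18) is

\[
-\int_{\{r = \varepsilon\}} \left( G_R \frac{\partial u}{\partial r} - u \frac{\partial G_R}{\partial r} \right) dS = -G_R(\varepsilon) \int_{\{r = \varepsilon\}} \frac{\partial u}{\partial r}\, dS + \frac{1}{A_n \varepsilon^{n-1}} \int_{\{r = \varepsilon\}} u\, dS.
\]

The first of these integrals can be estimated by noting that $\partial u/\partial r$ is a directional derivative and thus bounded in magnitude by $|\nabla u|$. By the assumption that $u \in C^2(\Omega)$, $|\partial u/\partial r|$ is therefore bounded by a constant $C$ for $r \le R$, yielding the estimate

\[
\left| \int_{\{r = \varepsilon\}} \frac{\partial u}{\partial r}\, dS \right| \le C A_n \varepsilon^{n-1}.
\]

Since the divergent term in $G_R(\varepsilon)$ as $\varepsilon \to 0$ is proportional to $\varepsilon^{2-n}$ for $n \ge 3$ and $\log \varepsilon$ for $n = 2$, this implies

\[
\lim_{\varepsilon \to 0} G_R(\varepsilon) \int_{\{r = \varepsilon\}} \frac{\partial u}{\partial r}\, dS = 0.
\]

Hence

\[
\lim_{\varepsilon \to 0} \left[ -\int_{\{r = \varepsilon\}} \left( G_R \frac{\partial u}{\partial r} - u \frac{\partial G_R}{\partial r} \right) dS \right] = \lim_{\varepsilon \to 0} \left[ \frac{1}{A_n \varepsilon^{n-1}} \int_{\{r = \varepsilon\}} u\, dS \right].
\]

The term in brackets on the right is the average of $u$ over a sphere of radius $\varepsilon$. Since $u$ is continuous, this average approaches $u(0)$ as $\varepsilon \to 0$, so that

\[
\lim_{\varepsilon \to 0} \left[ -\int_{\{r = \varepsilon\}} \left( G_R \frac{\partial u}{\partial r} - u \frac{\partial G_R}{\partial r} \right) dS \right] = u(0). \tag{9.21}
\]

Applying (9.19), (9.20), and (9.21) to (9.18) gives

\[
\int_{B(0; R)} G_R \Delta u\, d^n x = u(0) - \frac{1}{A_n R^{n-1}} \int_{\partial B(0; R)} u\, dS,
\]

which completes the proof. □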

For harmonic functions, Theorem 9.3 gives a generalization of the circle formula (9.13) to spherical averages in higher dimensions. As we will now show, the mean value property can be stated in an equivalent form in terms of averages over a ball.


Corollary 9.4 (Mean value for harmonic functions) Suppose $\Omega \subset \mathbb{R}^n$ for $n \ge 2$. For $u \in C^2(\Omega)$ the following properties are equivalent:

(A) The function $u$ is harmonic on $\Omega$.

(B) For $B(x_0; R) \subset \Omega$,

\[
u(x_0) = \frac{1}{A_n R^{n-1}} \int_{\partial B(x_0; R)} u\, dS.
\]

(C) For $B(x_0; R) \subset \Omega$,

\[
u(x_0) = \frac{n}{A_n R^n} \int_{B(x_0; R)} u\, d^n x.
\]

Proof The fact that (A) implies (B) follows immediately by setting $\Delta u = 0$ in the formula of Theorem 9.3.

To see that (B) and (C) are equivalent, fix some $x_0 \in \Omega$ and define

\[
h(r) := \int_{B(x_0; r)} u\, d^n x,
\]

for $r > 0$ such that $B(x_0; r) \subset \Omega$. As we saw in Exercise 2.4, the derivative of $h(r)$ is given by a surface integral

\[
h'(r) = \int_{\partial B(x_0; r)} u\, dS.
\]

Hence property (B) says that

\[
h'(r) = A_n r^{n-1} u(x_0),
\]

while property (C) says that

\[
h(r) = \frac{A_n r^n}{n}\, u(x_0).
\]

Since $h(0) = 0$ by definition, these two statements are equivalent.

Finally, we need to show that (B) implies (A). Assuming that (B) holds, Theorem 9.3 gives

\[
\int_{B(x_0; R)} G_R(x - x_0)\, \Delta u(x)\, d^n x = 0 \tag{9.22}
\]

provided $B(x_0; R) \subset \Omega$. Suppose $\Delta u(x_0) < 0$ for some $x_0 \in \Omega$. Then by continuity there exist $\varepsilon > 0$ and $\delta > 0$ such that $\Delta u \le -\varepsilon$ on $B(x_0; \delta)$. Since $G_\delta$ is strictly negative and decreasing as $r \to 0$, this implies

\[
\int_{B(x_0; \delta)} G_\delta(x - x_0)\, \Delta u(x)\, d^n x \ge -\varepsilon \int_{B(x_0; \delta)} G_\delta(x - x_0)\, d^n x > 0,
\]

which contradicts (9.22) with $R = \delta$. The same argument applies if $\Delta u(x_0) > 0$. We thus conclude that (B) implies $\Delta u = 0$. □
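
The mean value property (B), and the correction term of Theorem 9.3 when $u$ is not harmonic, can be illustrated in the plane. The sketch below uses two test functions chosen here for illustration: the harmonic polynomial $x^2 - y^2 + 3x$ and the non-harmonic $x^2 + y^2$, for which $\Delta u = 4$ and the volume term in Theorem 9.3 works out to $-R^2$ when $n = 2$. The helper name `circle_average` is hypothetical.

```python
import math


def circle_average(u, center, radius, samples=720):
    """Average of u over the circle of given center and radius (equally spaced points)."""
    cx, cy = center
    step = 2 * math.pi / samples
    return sum(
        u(cx + radius * math.cos(j * step), cy + radius * math.sin(j * step))
        for j in range(samples)
    ) / samples


def harmonic(x, y):
    """A harmonic polynomial: its Laplacian is 2 - 2 + 0 = 0."""
    return x * x - y * y + 3 * x


def not_harmonic(x, y):
    """Laplacian equals 4, so the mean value property fails."""
    return x * x + y * y


if __name__ == "__main__":
    center, radius = (0.3, -0.2), 0.5
    print(circle_average(harmonic, center, radius), harmonic(*center))
    # For x^2 + y^2 the circle average exceeds the center value by R^2,
    # matching u(x0) = average + (-R^2) from Theorem 9.3.
    print(circle_average(not_harmonic, center, radius), not_harmonic(*center) + radius**2)
```

In the second case the average over the circle is larger than the center value, which is the behavior expected of a subharmonic function.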


9.3  Strong  Principle  for  Subharmonic  Functions 

A real-valued $C^2$ function that satisfies

\[
-\Delta u \le 0
\]

is called subharmonic. The case $-\Delta u \ge 0$ is similarly called superharmonic. We will focus on the subharmonic case. The results can easily be translated to the superharmonic case by replacing $u$ by $-u$.

The “weak” maximum principle says that for a subharmonic function the maximum value occurs at a boundary point. We will prove here a “strong” version of this principle, which says furthermore that if the global maximum occurs at an interior point then the function is constant.

Theorem 9.5 (Strong maximum principle) Let $\Omega \subset \mathbb{R}^n$ be a bounded domain. If $u \in C^2(\Omega; \mathbb{R}) \cap C^0(\overline{\Omega})$ is subharmonic then

\[
\max_{\overline{\Omega}} u = \max_{\partial\Omega} u.
\]

The maximum is attained at an interior point only if $u$ is a constant function.

Proof By the extreme value theorem (Theorem A.2), $u$ achieves a global maximum at some point $x_0 \in \overline{\Omega}$. If $x_0 \in \partial\Omega$ then the claimed equality clearly holds. The goal is thus to show that $x_0 \in \Omega$ implies that $u$ is constant.

Because $\Omega$ is open, an interior point $x_0$ has a neighborhood contained in $\Omega$. We may thus assume that $B(x_0; R) \subset \Omega$ for some $R > 0$. Applying Theorem 9.3 to this ball gives

\[
u(x_0) = \frac{1}{A_n R^{n-1}} \int_{\partial B(x_0; R)} u(x)\, dS + \int_{B(x_0; R)} G_R(x - x_0)\, \Delta u(x)\, d^n x.
\]

By the definition (9.16), $G_R < 0$ for $0 < r < R$. Therefore, since $\Delta u \ge 0$ by assumption,

\[
u(x_0) \le \frac{1}{A_n R^{n-1}} \int_{\partial B(x_0; R)} u(x)\, dS. \tag{9.23}
\]

Using (9.14), we can subtract $u(x_0)$ from both sides to obtain

\[
\frac{1}{A_n R^{n-1}} \int_{\partial B(x_0; R)} \bigl[ u(x) - u(x_0) \bigr]\, dS \ge 0. \tag{9.24}
\]
By assumption $u(x_0)$ is the global maximum of $u$, implying that the integrand of (9.24) is nonpositive. The inequality therefore shows that the integrand vanishes, and we conclude that $u(x) = u(x_0)$ on $\partial B(x_0; R)$.

Note that the same argument works for every radius $r < R$, so this argument shows that $u = u(x_0)$ on all of $B(x_0; R)$.

To extend the conclusion to the full domain, let $M$ denote the maximum value of $u$ on $\overline{\Omega}$. We can write $\Omega$ as a disjoint union $E \cup F$, where

\[
E := \{x \in \Omega;\ u(x) < M\}, \qquad F := \{x \in \Omega;\ u(x) = M\}.
\]

By the argument given above, a point $x \in F$ has a neighborhood $B(x; R) \subset \Omega$ on which $u$ is equal to $M$. Hence $F$ is open.

On the other hand, for $x \in E$ we can set $\varepsilon = M - u(x)$ and use the continuity of $u$ to find a $\delta > 0$ such that

\[
|u(x) - u(y)| < \varepsilon
\]

for $y \in B(x; \delta)$. This implies in particular that $u(y) < M$, so that $B(x; \delta) \subset E$. Thus $E$ is open also.

Recall from Sect. 2.3 that the fact that $\Omega$ is connected means that the domain cannot be written as a disjoint union of nonempty open sets. Since $\Omega = E \cup F$ with $E$ and $F$ both open, one of the two sets is empty. If $E$ is empty then $u$ is constant on $\Omega$, while if $F$ is empty then the maximum of $u$ is not attained in the interior. □

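For a superharmonic function $u \in C^2(\Omega; \mathbb{R}) \cap C^0(\overline{\Omega})$, reversing the sign yields a minimum principle,

\[
\min_{\overline{\Omega}} u = \min_{\partial\Omega} u.
\]

Both principles apply to a harmonic function $u$, which therefore satisfies

\[
\min_{\partial\Omega} u \le u(x) \le \max_{\partial\Omega} u
\]

for all $x \in \Omega$.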

The  maximum  principle  implies  the  following  stability  result  for  the  Laplace 
equation. 

Corollary 9.6 Suppose that $u_1, u_2 \in C^2(\Omega) \cap C^0(\overline{\Omega})$ are solutions of the Laplace
equation $\Delta u = 0$ with boundary values

\[
u_1|_{\partial\Omega} = g_1, \qquad u_2|_{\partial\Omega} = g_2,
\]

for $g_1, g_2 \in C^0(\partial\Omega)$. Then

\[
\max_{\overline{\Omega}} |u_2 - u_1| \le \max_{\partial\Omega} |g_2 - g_1|. \tag{9.25}
\]

In particular, a solution of the Laplace equation is uniquely determined by its boundary
data.

Proof By superposition, $u_2 - u_1$ is a harmonic function with boundary data $g_2 - g_1$.
Theorem 9.5 applies to $\pm\operatorname{Re}(u_2 - u_1)$ as well as $\pm\operatorname{Im}(u_2 - u_1)$. Combining these
estimates yields the inequality (9.25). □

Note that uniqueness of solutions of the Laplace equation also follows directly
from Green’s first identity (Theorem 2.10), in the case where $\Omega$ has piecewise $C^1$
boundary and $u \in C^2(\overline{\Omega}; \mathbb{R})$. If $\Delta u = 0$, then setting $v = u$ in Green’s formula
gives

\[
\int_\Omega \|\nabla u\|^2\, d^n x = \int_{\partial\Omega} u \frac{\partial u}{\partial \nu}\, dS.
\]

Thus if $u = 0$ on $\partial\Omega$, then

\[
\int_\Omega \|\nabla u\|^2\, d^n x = 0.
\]

Since the integrand is nonnegative and continuous, this implies $\nabla u = 0$, so that $u$ is
constant. The assumption $u|_{\partial\Omega} = 0$ then gives $u = 0$ on the full domain. (This is the
“energy method” argument, as introduced in Sect. 4.7.)

One advantage that the maximum principle has over the energy method is the explicit
stability formula (9.25). In terms of the $L^\infty$ norm introduced in Sect. 7.3, this inequality
could be written

\[
\|u_2 - u_1\|_\infty \le \|g_2 - g_1\|_\infty.
\]

This is an explicit formulation of the continuity requirement for well-posedness:
a small change in boundary data results in a correspondingly small change in the
solution.
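The stability estimate can be observed in a discrete setting. The following is only a
sketch under stated assumptions: the domain is the unit square, the Laplace equation is
replaced by the five-point averaging rule, and the boundary data `g1`, `g2`, the grid
size, and the number of Jacobi sweeps are all hypothetical choices made for illustration.
Because each sweep replaces an interior value by the average of its four neighbours, the
difference of the two discrete solutions cannot exceed the difference of the boundary data.

```python
import math

def solve_laplace(boundary, n=16, sweeps=600):
    """Jacobi iteration for the discrete Laplace equation on the unit square.

    boundary(x, y) supplies the Dirichlet data; interior nodes start at zero.
    """
    h = 1.0 / n
    u = [[boundary(i * h, j * h) if i in (0, n) or j in (0, n) else 0.0
          for j in range(n + 1)] for i in range(n + 1)]
    for _ in range(sweeps):
        new = [row[:] for row in u]
        for i in range(1, n):
            for j in range(1, n):
                # Discrete mean value property: average of the four neighbours.
                new[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                    + u[i][j - 1] + u[i][j + 1])
        u = new
    return u

def g1(x, y):
    return math.sin(math.pi * x) * (1.0 - y)

def g2(x, y):
    # A small perturbation of g1 (size 0.01), chosen only for illustration.
    return g1(x, y) + 0.01 * math.cos(3.0 * x + y)

u1, u2 = solve_laplace(g1), solve_laplace(g2)
n = len(u1) - 1
diff = [[u2[i][j] - u1[i][j] for j in range(n + 1)] for i in range(n + 1)]
boundary_gap = max(abs(diff[i][j]) for i in range(n + 1) for j in range(n + 1)
                   if i in (0, n) or j in (0, n))
interior_gap = max(abs(diff[i][j]) for i in range(1, n) for j in range(1, n))
print(boundary_gap, interior_gap)
```

The printed interior gap should not exceed the boundary gap, which is the discrete
counterpart of (9.25).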


9.4  Weak  Principle  for  Elliptic  Equations 


Although the mean value formula gives a direct proof of the strong maximum principle,
this approach applies only to the Laplacian itself. In this section we will present
an alternative approach that generalizes quite easily to operators with variable
coefficients.

On a domain $\Omega \subset \mathbb{R}^n$ let us consider a second order elliptic operator of the form

\[
L = -\sum_{i,j=1}^n a_{ij}(x) \frac{\partial^2}{\partial x_i\, \partial x_j}
    + \sum_{j=1}^n b_j(x) \frac{\partial}{\partial x_j}, \tag{9.26}
\]

where the coefficients $a_{ij}$ and $b_j$ are continuous functions on $\overline{\Omega}$. As defined in Sect.
1.3, ellipticity means that the symmetric matrix $[a_{ij}]$ is positive definite at each point.
For the maximum principle we need a stronger assumption, called uniform ellipticity,
that says that for some fixed constant $\kappa > 0$,

\[
\sum_{i,j=1}^n a_{ij}(x) v_i v_j \ge \kappa \|v\|^2 \tag{9.27}
\]

for all $x \in \Omega$ and $v \in \mathbb{R}^n$. An equivalent way to say this is that the smallest eigenvalue
of $[a_{ij}]$ is bounded below by $\kappa$ at each point $x$.

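Theorem 9.7 (Weak maximum principle) Suppose $\Omega \subset \mathbb{R}^n$ is bounded, and $L$ is
an operator of the form (9.26) satisfying the uniform ellipticity condition (9.27). If
$u \in C^2(\Omega; \mathbb{R}) \cap C^0(\overline{\Omega})$ satisfies

\[
Lu \le 0
\]

in $\Omega$, then

\[
\max_{\overline{\Omega}} u = \max_{\partial\Omega} u.
\]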

Proof For the moment let $u$ be a general function in $C^2(\Omega; \mathbb{R})$. Suppose that $u$ has
a local maximum at $x_0 \in \Omega$. The first partial derivatives of $u$ vanish at a local
maximum, so that

\[
Lu(x_0) = -\sum_{i,j=1}^n a_{ij}(x_0) \frac{\partial^2 u}{\partial x_i\, \partial x_j}(x_0). \tag{9.28}
\]

Furthermore, we claim that the matrix of second partials of $u$ is negative semidefinite at
$x_0$, meaning

\[
\sum_{i,j=1}^n \frac{\partial^2 u}{\partial x_i\, \partial x_j}(x_0)\, v_i v_j \le 0
\]

for $v \in \mathbb{R}^n$. To see this, set $h(t) := u(x_0 + tv)$ and note that $h$ has a local maximum
at $t = 0$, implying $h''(0) \le 0$. Evaluating $h''(0)$ yields the inequality stated above.

The right-hand side of (9.28) could be written as $\operatorname{tr}(AB)$ where $A$ and $B$ are the
positive symmetric matrices

\[
A = [a_{ij}(x_0)], \qquad B = \left[-\frac{\partial^2 u}{\partial x_i\, \partial x_j}(x_0)\right].
\]

By switching to a basis in which $A$ is diagonal, $\operatorname{tr}(AB)$ can be written in terms of
the eigenvalues $\{\lambda_j\}$ of $A$ as

\[
\operatorname{tr}(AB) = \sum_{j=1}^n \lambda_j b_{jj}.
\]

If the eigenvalues are ordered $\lambda_1 \le \dots \le \lambda_n$, then

\[
\operatorname{tr}(AB) \ge \lambda_1 \operatorname{tr} B.
\]

The positivity of $B$ implies $\operatorname{tr} B \ge 0$, so we conclude that

\[
Lu(x_0) \ge 0.
\]
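The trace inequality used in this step can be checked on a small example. The sketch
below assumes $2 \times 2$ matrices with hypothetical entries: `A` is positive definite
and `B` is built as $C^T C$, which makes it positive semidefinite. The helper names are
illustrative only.

```python
import math

def eig_min_sym2(a11, a12, a22):
    """Smallest eigenvalue of the symmetric matrix [[a11, a12], [a12, a22]]."""
    mean = 0.5 * (a11 + a22)
    radius = math.hypot(0.5 * (a11 - a22), a12)
    return mean - radius

def trace_product(A, B):
    """tr(AB) for 2x2 matrices stored as nested lists."""
    return sum(A[i][k] * B[k][i] for i in range(2) for k in range(2))

# Hypothetical sample data: A positive definite, B = C^T C positive semidefinite.
A = [[2.0, 0.5], [0.5, 1.0]]
C = [[1.0, -2.0], [0.0, 0.5]]
B = [[sum(C[k][i] * C[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]

lam1 = eig_min_sym2(A[0][0], A[0][1], A[1][1])
lhs = trace_product(A, B)              # tr(AB)
rhs = lam1 * (B[0][0] + B[1][1])       # lambda_1 * tr(B)
print(lam1, lhs, rhs)
```

For these values the first quantity in the comparison, $\operatorname{tr}(AB)$, is at
least $\lambda_1 \operatorname{tr} B$, in agreement with the argument above.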


This argument shows that $Lu(x_0) \ge 0$ for $x_0$ a local interior maximum. Therefore,
the strict inequality $Lu < 0$ implies that $u$ cannot have a local interior maximum and
that

\[
\max_{\overline{\Omega}} u = \max_{\partial\Omega} u. \tag{9.29}
\]

To complete the proof, we must relax the hypothesis to $Lu \le 0$. The strategy is
to perturb $u$ slightly to reduce to the previous case. For $M > 0$, let

\[
h(x) := e^{M x_1}.
\]

By the definition of $L$,

\[
Lh = \bigl[-a_{11} M^2 + b_1 M\bigr] h.
\]

The ellipticity condition (9.27) implies that $a_{11} \ge \kappa$, so by choosing

\[
M > \frac{1}{\kappa} \max_{\overline{\Omega}} |b_1|,
\]

we can guarantee that

\[
Lh < 0.
\]
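For reference, the computation behind this choice of $M$ can be written out as follows.
This is a sketch that assumes the sign convention in which the second-order part of $L$
carries a minus sign, as in the formula for $Lh$ above; the bound $a_{11} \ge \kappa$
comes from (9.27) with $v$ equal to the first standard basis vector.

```latex
\begin{aligned}
\frac{\partial h}{\partial x_1} &= M h, \qquad
\frac{\partial^2 h}{\partial x_1^2} = M^2 h, \qquad
\frac{\partial h}{\partial x_j} = 0 \quad (j \neq 1), \\
Lh &= -a_{11} M^2 h + b_1 M h
    \;\le\; -M \bigl( \kappa M - |b_1| \bigr) h \;<\; 0
    \qquad \text{whenever } M > \frac{1}{\kappa} \max |b_1| ,
\end{aligned}
```

where the last step uses $h > 0$.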


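If we now assume that $u$ satisfies the hypothesis $Lu \le 0$, then

\[
L(u + \varepsilon h) < 0
\]

for $\varepsilon > 0$. Applying (9.29) to $u + \varepsilon h$ gives

\[
\max_{\overline{\Omega}} (u + \varepsilon h) = \max_{\partial\Omega} (u + \varepsilon h). \tag{9.30}
\]

Since $h > 0$, clearly

\[
\max_{\overline{\Omega}} u \le \max_{\overline{\Omega}} (u + \varepsilon h).
\]

On the other hand, since $\Omega$ is bounded, we may assume $x_1 \le R$ in $\Omega$, for some $R$
sufficiently large. This implies

\[
h \le e^{MR},
\]

so that

\[
\max_{\partial\Omega} (u + \varepsilon h) \le \max_{\partial\Omega} u + \varepsilon e^{MR}.
\]

From (9.30) we therefore conclude that

\[
\max_{\overline{\Omega}} u \le \max_{\partial\Omega} u + \varepsilon e^{MR}
\]

for all $\varepsilon > 0$. Since $M$ and $R$ are independent of $\varepsilon$, we can take $\varepsilon \to 0$ to conclude
that

\[
\max_{\overline{\Omega}} u \le \max_{\partial\Omega} u,
\]

and the result follows. □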


The maximum principle implies in particular that the only solution of $Lu = 0$ with
$u|_{\partial\Omega} = 0$ is $u = 0$. Hence a solution of the equation

\[
Lu = f, \qquad u|_{\partial\Omega} = g,
\]

is uniquely determined by $f$ and $g$ if it exists.


9.5  Application  to  the  Heat  Equation 

Fourier’s  law  of  heat  conduction,  as  introduced  in  Sect.  6.1,  suggests  a  maximum 
principle  for  solutions  of  the  heat  equation.  Because  heat  flows  away  from  a  spatial 
maximum  of  the  temperature,  a  local  spatial  maximum  of  the  temperature  should  be 
impossible  at  time  t  >  0.  The  global  maximum  of  the  temperature  therefore  must 
occur  either  at  t  =  0  or  on  the  boundary. 

Although  it  is  possible  to  prove  the  maximum  principle  via  a  mean  value  formula 
as  in  the  proof  of  Theorem  9.5,  in  this  section  we  will  follow  the  more  direct  approach 
from  Sect.  9.4,  which  has  the  advantage  of  generalizing  to  operators  with  variable 
coefficients. 

Because  the  heat  equation  is  second  order  with  respect  to  spatial  variables  and  first 
order  in  the  time  variable,  it  makes  sense  to  define  a  domain  for  classical  solutions 
that takes this structure into account. For $\Omega \subset \mathbb{R}^n$, define


Note that this definition includes only real-valued functions.

Theorem 9.8 Suppose $\Omega \subset \mathbb{R}^n$ is a bounded domain and $u \in C_{\mathrm{heat}}(\Omega)$ satisfies

\[
\frac{\partial u}{\partial t} - \Delta u \le 0 \tag{9.31}
\]

on $(0, T) \times \Omega$. Then the maximum value of $u$ within $[0, T] \times \overline{\Omega}$ occurs at a point
$(t_0, x_0)$ with either $t_0 = 0$ or $x_0 \in \partial\Omega$.

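Proof Suppose that $u$ attains a maximum at $(t_0, x_0) \in (0, T) \times \Omega$. By the same
calculus argument used in Sect. 9.4, this implies

\[
\frac{\partial u}{\partial t}(t_0, x_0) = 0, \tag{9.32}
\]

as well as

\[
\frac{\partial u}{\partial x_j}(t_0, x_0) = 0, \qquad
\frac{\partial^2 u}{\partial x_j^2}(t_0, x_0) \le 0. \tag{9.33}
\]

In particular,

\[
\left( \frac{\partial u}{\partial t} - \Delta u \right)(t_0, x_0) \ge 0. \tag{9.34}
\]

If (9.31) were a strict inequality this would complete the proof.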

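To proceed we use a perturbation strategy as in the proof of Theorem 9.7. For
$\varepsilon > 0$, set

\[
u_\varepsilon := u + \varepsilon |x|^2.
\]

Because $\Delta |x|^2 = 2n$, the hypothesis on $u$ gives

\[
\frac{\partial u_\varepsilon}{\partial t} - \Delta u_\varepsilon
  = \frac{\partial u}{\partial t} - \Delta u - 2n\varepsilon < 0. \tag{9.35}
\]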
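The value of $\Delta |x|^2$ used here follows from a one-line computation, recorded
below as a worked step. Since the perturbation does not depend on $t$, only the
Laplacian term changes.

```latex
\begin{aligned}
|x|^2 &= x_1^2 + \dots + x_n^2, \qquad
\frac{\partial^2}{\partial x_j^2} |x|^2 = 2 \quad (j = 1, \dots, n), \\
\Delta |x|^2 &= \sum_{j=1}^n \frac{\partial^2}{\partial x_j^2} |x|^2 = 2n, \qquad
\Delta u_\varepsilon = \Delta u + 2n\varepsilon, \qquad
\frac{\partial u_\varepsilon}{\partial t} = \frac{\partial u}{\partial t}.
\end{aligned}
```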

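The existence of a local maximum for $u_\varepsilon$ within $(0, T) \times \Omega$ is ruled out by (9.34),
applied to $u_\varepsilon$, together with the strict inequality (9.35).
We conclude that $u_\varepsilon$ attains a global maximum at a boundary point of $[0, T] \times \overline{\Omega}$.
Let us label this point $(t_\varepsilon, x_\varepsilon)$, so that

\[
\max_{[0,T] \times \overline{\Omega}} u_\varepsilon = u_\varepsilon(t_\varepsilon, x_\varepsilon). \tag{9.36}
\]

Since $(t_\varepsilon, x_\varepsilon)$ is on the boundary, either $t_\varepsilon = 0$, $t_\varepsilon = T$, or $x_\varepsilon \in \partial\Omega$.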

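Suppose that $t_\varepsilon = T$ and $x_\varepsilon \in \Omega$. Then $u_\varepsilon(t, x_\varepsilon) \le u_\varepsilon(T, x_\varepsilon)$ for $t \in [0, T]$,
implying that

\[
\frac{\partial u_\varepsilon}{\partial t}(T, x_\varepsilon) \ge 0.
\]

By (9.35), this implies also that $\Delta u_\varepsilon(T, x_\varepsilon) > 0$, which is ruled out by (9.33). Hence
$t_\varepsilon \ne T$ if $x_\varepsilon \in \Omega$.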

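Therefore $(t_\varepsilon, x_\varepsilon)$ lies in the set

\[
\Gamma := \bigl( \{0\} \times \overline{\Omega} \bigr) \cup \bigl( [0, T] \times \partial\Omega \bigr). \tag{9.37}
\]

Let $R$ be sufficiently large so that $\Omega \subset B(0; R)$. This means that $|x| < R$ on $\Omega$, so
the inequality

\[
u \le u_\varepsilon \le u + \varepsilon R^2
\]

holds at every point in $[0, T] \times \overline{\Omega}$. From (9.36) we can thus conclude that

\[
\max_{[0,T] \times \overline{\Omega}} u \le u_\varepsilon(t_\varepsilon, x_\varepsilon)
  \le u(t_\varepsilon, x_\varepsilon) + \varepsilon R^2. \tag{9.38}
\]

This implies that

\[
\max_{[0,T] \times \overline{\Omega}} u \le \max_{\Gamma} u + \varepsilon R^2,
\]

because $(t_\varepsilon, x_\varepsilon) \in \Gamma$. Since this inequality holds for every $\varepsilon > 0$, this proves

\[
\max_{[0,T] \times \overline{\Omega}} u \le \max_{\Gamma} u.
\]

□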

For a solution of the heat equation, both $\pm u$ satisfy the hypothesis of Theorem 9.8,
which implies that

\[
\min_{\Gamma} u \le u(t, x) \le \max_{\Gamma} u,
\]

for $(t, x) \in (0, T) \times \Omega$, where $\Gamma$ is defined by (9.37). In particular this yields the
following:

Corollary 9.9 Let $\Omega \subset \mathbb{R}^n$ be a bounded domain. A solution of the heat equation
$u \in C_{\mathrm{heat}}(\Omega)$ is uniquely determined by $u|_{\partial\Omega}$ and $u|_{t=0}$.
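A discrete analogue of this bound can be observed with the standard explicit
finite-difference scheme for $u_t = u_{xx}$ in one space dimension. The sketch below is
an illustration under stated assumptions rather than part of the argument: the interval
$[0, 1]$, the zero boundary values, the initial profile `g`, and the parameters `nx`,
`r`, `steps` are all hypothetical. For $r = \Delta t / \Delta x^2 \le 1/2$ each update
is a convex combination of neighbouring values, so the scheme cannot create new extrema.

```python
import math

def heat_explicit(g, nx=50, r=0.4, steps=200):
    """Explicit scheme for u_t = u_xx on [0, 1] with u = 0 at both end points.

    r = dt / dx**2.  Returns the list of time levels (lists of nodal values).
    """
    dx = 1.0 / nx
    u = [g(i * dx) for i in range(nx + 1)]
    u[0] = u[-1] = 0.0  # boundary values held at zero
    levels = [u]
    for _ in range(steps):
        prev = levels[-1]
        new = prev[:]
        for i in range(1, nx):
            # Equivalent to (1 - 2r) * prev[i] + r * prev[i-1] + r * prev[i+1].
            new[i] = prev[i] + r * (prev[i - 1] - 2.0 * prev[i] + prev[i + 1])
        levels.append(new)
    return levels

def g(x):
    # Hypothetical initial temperature profile.
    return math.sin(math.pi * x) + 0.3 * math.sin(5.0 * math.pi * x)

levels = heat_explicit(g)

# Extremes over the discrete version of Gamma: initial time and the two end points.
gamma_max = max(max(levels[0]), max(lv[0] for lv in levels), max(lv[-1] for lv in levels))
gamma_min = min(min(levels[0]), min(lv[0] for lv in levels), min(lv[-1] for lv in levels))
later_max = max(max(lv) for lv in levels[1:])
later_min = min(min(lv) for lv in levels[1:])
print(gamma_min, later_min, later_max, gamma_max)
```

Up to rounding, the values at later times stay between the minimum and maximum taken
over the initial data and the boundary.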

The same arguments could be applied to the more general parabolic equation

\[
\frac{\partial u}{\partial t} + Lu = 0,
\]

where $L$ is an operator of the form (9.26) satisfying the uniform ellipticity condition (9.27).

In Sect. 6.3, we stated without proof a uniqueness result for solutions of the heat
equation on $\mathbb{R}^n$. We now have the means to prove this, by establishing a maximum
principle for $\mathbb{R}^n$ as a corollary of Theorem 9.8.

Corollary 9.10 Suppose that $u$ is a classical solution of the heat equation

\[
\frac{\partial u}{\partial t} - \Delta u = 0, \qquad u|_{t=0} = g, \tag{9.39}
\]

on $[0, \infty) \times \mathbb{R}^n$, and that $u$ is bounded on $[0, T] \times \mathbb{R}^n$ for $T > 0$. Then

\[
\max_{[0,\infty) \times \mathbb{R}^n} u \le \max_{\mathbb{R}^n} g. \tag{9.40}
\]

Proof Assume that $u$ satisfies (9.39) and also

\[
u(t, x) \le M
\]

for $t \in [0, T]$ and $x \in \mathbb{R}^n$. For $y \in \mathbb{R}^n$ and $\varepsilon > 0$ set

\[
v(t, x) := u(t, x) - \varepsilon\, (T - t)^{-\frac{n}{2}}\, e^{\frac{|x - y|^2}{4(T - t)}}.
\]

The $\varepsilon$ term resembles the heat kernel defined by (6.16), except that the sign in the
exponential is reversed. Direct differentiation shows that this expression satisfies the
heat equation on $(0, T) \times \mathbb{R}^n$, and hence $v$ does also.
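The direct differentiation can be spelled out as follows. The symbols $w$ and $s$ are
introduced here only as shorthand for the $\varepsilon$ term (without the factor
$\varepsilon$) and for $T - t$.

```latex
\begin{aligned}
w(t, x) &:= s^{-\frac{n}{2}}\, e^{\frac{|x - y|^2}{4s}}, \qquad s := T - t, \\
\frac{\partial w}{\partial t} &= -\frac{\partial w}{\partial s}
   = \left[ \frac{n}{2s} + \frac{|x - y|^2}{4 s^2} \right] w, \\
\frac{\partial w}{\partial x_j} &= \frac{x_j - y_j}{2s}\, w, \qquad
\frac{\partial^2 w}{\partial x_j^2}
   = \left[ \frac{1}{2s} + \frac{(x_j - y_j)^2}{4 s^2} \right] w, \\
\Delta w &= \left[ \frac{n}{2s} + \frac{|x - y|^2}{4 s^2} \right] w
   = \frac{\partial w}{\partial t}.
\end{aligned}
```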

For $R > 0$, let us apply the maximum principle of Theorem 9.8 to $v$ on the domain
$(0, T) \times B(y; R)$. By construction,

\[
v(0, x) \le g(x),
\]

and for $x \in \partial B(y; R)$,

\[
\begin{aligned}
v(t, x) &\le M - \varepsilon\, (T - t)^{-\frac{n}{2}}\, e^{\frac{R^2}{4(T - t)}} \\
        &\le M - \varepsilon\, T^{-\frac{n}{2}}\, e^{R^2/4T}.
\end{aligned}
\]

With $T$ fixed, the right-hand side of this second inequality is arbitrarily negative for
large $R$. Therefore, for sufficiently large $R$, Theorem 9.8 implies that

\[
\max_{[0,T] \times \overline{B(y;R)}} v \le \max_{\overline{B(y;R)}} g.
\]

In particular, setting $x = y$ in this inequality gives

\[
v(t, y) \le \max_{\mathbb{R}^n} g
\]

for $t \in [0, T]$ and $y \in \mathbb{R}^n$. By the definition of $v$, this implies that

\[
u(t, y) \le \max_{\mathbb{R}^n} g + \varepsilon\, (T - t)^{-\frac{n}{2}}.
\]

We can now take $\varepsilon \to 0$ and $T \to \infty$ to conclude that

\[
u(t, y) \le \max_{\mathbb{R}^n} g
\]

for all $t \in [0, \infty)$ and $y \in \mathbb{R}^n$.


□

The argument given here can be refined to show that conclusion (9.40) holds under
the weaker growth condition

\[
u(t, x) \le M e^{c|x|^2}
\]

for $t \in [0, T]$.

Corollary 9.10 implies Theorem 6.3 by the argument used in Corollary 9.9. That
is, if $u_1$ and $u_2$ are bounded solutions of (6.19), then $\pm(u_1 - u_2)$ solves (6.19) with
$g = 0$. It then follows from (9.40) that $u_1 = u_2$.


9.6  Exercises 


9.1 Suppose that $u, \phi \in C^2(\Omega; \mathbb{R}) \cap C^0(\overline{\Omega})$ on a bounded domain $\Omega \subset \mathbb{R}^n$. Assume
that $u$ is subharmonic and $\phi$ is harmonic, with matching boundary values:

\[
u|_{\partial\Omega} = \phi|_{\partial\Omega}.
\]

Show that

\[
u \le \phi
\]

at all points of $\Omega$. (This is the motivation for the term “subharmonic”.)

9.2 Liouville’s theorem says that a bounded harmonic function on $\mathbb{R}^n$ is constant.
To show this, assume $u \in C^2(\mathbb{R}^n)$ is harmonic and satisfies

\[
|u(x)| \le M
\]

for all $x \in \mathbb{R}^n$.

(a) For $x_0 \in \mathbb{R}^n$, set $r_0 = |x_0|$. Use Corollary 9.4 at the centers $0$ and $x_0$ to show
that

\[
u(0) - u(x_0) = \frac{n}{A_n R^n} \left[ \int_{B(0;R)} u\, d^n x - \int_{B(x_0;R)} u\, d^n x \right] \tag{9.41}
\]

for $R > 0$. Note that the integrals cancel on the intersection of the two balls.

(b) Show that

\[
\operatorname{vol}\bigl[ B(0; R) \setminus B(x_0; R) \bigr]
  \le \operatorname{vol}\bigl[ B(0; R) \setminus B(0; R - r_0) \bigr]
  = \frac{A_n}{n} \bigl[ R^n - (R - r_0)^n \bigr],
\]

and the same for $B(x_0; R) \setminus B(0; R)$.

(c) Apply the volume estimates and the fact that $|u| \le M$ to (9.41) to estimate

\[
|u(0) - u(x_0)| \le 2M\, \frac{R^n - (R - r_0)^n}{R^n}.
\]

Take the limit $R \to \infty$ to show that $u(x_0) = u(0)$.

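9.3 Suppose that $\Omega \subset \mathbb{R}^n$ is bounded, with $\Omega \subset B(0; R)$, and assume that $u \in
C^2(\Omega; \mathbb{R}) \cap C^0(\overline{\Omega})$ satisfies

\[
-\Delta u = f, \qquad u|_{\partial\Omega} = 0,
\]

for $f \in C^0(\overline{\Omega})$.

(a) Find a constant $c > 0$ (depending on $f$ and $R$), such that $u + c|x|^2$ is subharmonic
on $\Omega$.

(b) For this value of $c$, apply the maximum principle to $u + c|x|^2$ to deduce that

\[
\max_{\overline{\Omega}} |u| \le C \max_{\overline{\Omega}} |f|,
\]

where $C$ depends only on $R$.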

9.4 Suppose $u$ is a harmonic function on a domain that includes $B(0; 4R)$ for some
$R > 0$, and assume $u \ge 0$. Show that

\[
\max_{B(0;R)} u \le 3^n \min_{B(0;R)} u.
\]

Hint: For $x \in B(0; R)$, apply the mean value formula (Corollary 9.4) to write $u(x)$ as
an integral over the balls $B(x; R)$ and $B(x; 3R)$. Then show that

\[
B(x; R) \subset B(0; 2R) \subset B(x; 3R),
\]

and use this to estimate the integrals.

9.5 Suppose $u \in C^2(B; \mathbb{R}) \cap C^0(\overline{B})$ is a nonconstant subharmonic function and
assume that the maximum of $u$ on $\overline{B}$ is attained at the point $x_0 \in \partial B$. This automatically
implies that $\frac{\partial u}{\partial r}(x_0) \ge 0$. Hopf’s lemma says that this inequality is strict,

\[
\frac{\partial u}{\partial r}(x_0) > 0.
\]

To show this, let $B := B(0; R) \subset \mathbb{R}^n$ for some $R > 0$, and set

\[
A := \{ R/2 < |x| < R \}.
\]

(a) Consider the function

\[
h(x) := e^{-2n|x|^2/R^2} - e^{-2n}.
\]

Compute $\Delta h$ and show that $h$ is subharmonic on $A$.

(b) Set

\[
m = \max_{\{r = R/2\}} u, \qquad M = \max_{\{r = R\}} u,
\]

and show that $m < M$.

(c) For $\varepsilon > 0$ set

\[
u_\varepsilon := u + \varepsilon h,
\]

and show that by taking $\varepsilon$ sufficiently small we may assume that

\[
\max_{\partial A} u_\varepsilon \le M.
\]

(d) Show that $u_\varepsilon(x) \le M$ for $x \in A$, and hence that

\[
\frac{\partial u_\varepsilon}{\partial r}(x_0) \ge 0.
\]

(e) By computing $\partial u_\varepsilon / \partial r$ and taking $\varepsilon \to 0$, conclude that

\[
\frac{\partial u}{\partial r}(x_0) > 0.
\]

Chapter  10 

Weak  Solutions 


In Sect. 1.2 we observed that d’Alembert’s formula for a solution of the wave equation
makes  sense  even  when  the  initial  data  are  not  differentiable.  This  concept  of  a  weak 
solution  that  is  not  actually  required  to  solve  the  equation  literally  has  come  up  in  other 
contexts  as  well,  for  example  in  the  discussion  of  the  traffic  equation  in  Sect.  3.4.  In 
this  chapter  we  will  discuss  the  mathematical  formulation  of  this  generalized  notion 
of  solution. 

Weak  solutions  first  appeared  in  physical  applications  as  idealized,  limiting  cases 
of  true  solutions.  For  example,  one  might  replace  a  smooth  density  function  by  a 
simpler piecewise linear approximation, as illustrated in Fig. 10.1, in order to simplify
computations.  (We  used  this  idea  in  Example  3.9.) 

Up  until  the  late  19th  century,  the  limiting  process  by  which  weak  solutions  were 
obtained  was  understood  rather  loosely,  and  justified  mainly  by  physical  intuition. 
Weak solutions proved to be extremely useful, and eventually a consistent mathematical
framework was developed.


10.1  Test  Functions  and  Weak  Derivatives 

Consider a linear equation of the form $Lu = f$, where $L$ is a differential operator
on a domain $\Omega \subset \mathbb{R}^n$. Suppose that $u$ represents a physical quantity such as temperature
or density. Direct observation of such quantities at a single point is a practical
impossibility. Even the most sensitive instrument will only be able to measure the
weighted average over some small region.

To formalize this notion of a local average, we use the concept of a test function
$\psi \in C^\infty_{\mathrm{cpt}}(\Omega)$. The test function defines a local measurement of a quantity $u$ through
the integral

\[
\int_\Omega u\psi\, d^n x. \tag{10.1}
\]

Fig. 10.1 A smooth function and its piecewise linear approximation

The function $\psi$ plays the role of an experimental probe that takes a particular sample
of the values of $u$.

Let us consider how we would “detect” a derivative using test functions. Suppose
for the moment that $u \in C^1(\mathbb{R})$, with $u' = f$. If we measure this derivative
using the test function $\psi \in C^\infty_{\mathrm{cpt}}(\mathbb{R})$, the result is

\[
\int_{-\infty}^{\infty} u'\psi\, dx = \int_{-\infty}^{\infty} f\psi\, dx. \tag{10.2}
\]

The fact that $u' = f$ is equivalent to the statement that (10.2) holds for all
$\psi \in C^\infty_{\mathrm{cpt}}(\mathbb{R})$.

Note that the left-hand side could be integrated by parts, since $\psi$ has compact
support, yielding

\[
-\int_{-\infty}^{\infty} u\psi'\, dx = \int_{-\infty}^{\infty} f\psi\, dx. \tag{10.3}
\]

This condition now makes sense even when $u$ fails to be differentiable. The only
requirement is that $u$ and $f$ be integrable on compact sets, a property we refer to as
local integrability. We can say that locally integrable functions satisfy $u' = f$ in the
weak sense provided (10.3) holds for all $\psi \in C^\infty_{\mathrm{cpt}}(\mathbb{R})$.

To generalize this definition to a domain $\Omega \subset \mathbb{R}^n$, let us define the space of locally
integrable functions,

\[
L^1_{\mathrm{loc}}(\Omega) := \bigl\{ f : \Omega \to \mathbb{C};\ f|_K \in L^1(K) \text{ for all compact } K \subset \Omega \bigr\}.
\]

The same equivalence relation (7.6) used for $L^p$ spaces applies to $L^1_{\mathrm{loc}}$, i.e., functions
that differ on a set of measure zero are considered to be the same.

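Inspired by (10.3), for $u$ and $f \in L^1_{\mathrm{loc}}(\Omega)$ we say that

\[
\frac{\partial u}{\partial x_j} = f
\]

as a weak derivative if

\[
-\int_\Omega u\, \frac{\partial \psi}{\partial x_j}\, d^n x = \int_\Omega f\psi\, d^n x \tag{10.4}
\]

for all $\psi \in C^\infty_{\mathrm{cpt}}(\Omega)$. The condition (10.4) determines $f$ uniquely as an element of
$L^1_{\mathrm{loc}}(\Omega)$, by the following:

Lemma 10.1 If $f \in L^1_{\mathrm{loc}}(\Omega)$ satisfies

\[
\int_\Omega f\psi\, d^n x = 0
\]

for all $\psi \in C^\infty_{\mathrm{cpt}}(\Omega)$, then $f \equiv 0$.

Proof It suffices to consider the case when $\Omega$ is bounded, since a larger domain
could be subdivided into bounded pieces. For bounded $\Omega$ the local integrability of $f$
implies that $f \in L^2(\Omega)$. By Theorem 7.5 we can choose a sequence $\psi_k$ in $C^\infty_{\mathrm{cpt}}(\Omega)$
such that $\psi_k \to f$ in $L^2(\Omega)$. This implies that

\[
\lim_{k \to \infty} \langle f, \psi_k \rangle = \|f\|_2^2.
\]

The inner products $\langle f, \psi_k \rangle$ are zero by hypothesis, so we conclude that $f \equiv 0$. □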


Example 10.2 In $\mathbb{R}$, consider the piecewise linear function

\[
g(x) := \begin{cases} 0, & x < 0, \\ x, & 0 \le x \le 1, \\ 1, & x > 1. \end{cases}
\]

If we ignore the points where $g$ is not differentiable, then we would expect that the
derivative of $g$ is given by

\[
f(x) = \begin{cases} 1, & 0 < x < 1, \\ 0, & x < 0 \text{ or } x > 1. \end{cases}
\]

These functions are illustrated in Fig. 10.2.

Let us check that this works in the sense of weak derivatives. We take $\psi \in C^\infty_{\mathrm{cpt}}(\mathbb{R})$
and compute

\[
\int_{-\infty}^{\infty} g\psi'\, dx = \int_0^1 x\psi'(x)\, dx + \int_1^{\infty} \psi'\, dx.
\]

Fig. 10.2 A piecewise linear function and its weak derivative

Using integration by parts on the first term and evaluating the second gives

\[
\int_{-\infty}^{\infty} g\psi'\, dx = \psi(1) - \int_0^1 \psi\, dx - \psi(1)
  = -\int_{-\infty}^{\infty} f\psi\, dx.
\]

This verifies that $g' = f$ in the weak sense. ◊
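The identity in this example can also be tested numerically for one particular test
function. The sketch below is illustrative only: the bump function `psi`, its center and
half-width, and the midpoint rule with 4000 cells per piece are hypothetical choices,
and a single test function does not prove the weak identity, which must hold for all of
them.

```python
import math

CENTER, HALF_WIDTH = 0.3, 1.5  # hypothetical bump, supported on [-1.2, 1.8]

def g(x):
    """Piecewise linear function from Example 10.2."""
    if x < 0.0:
        return 0.0
    return x if x <= 1.0 else 1.0

def f(x):
    """Candidate weak derivative: 1 on (0, 1), 0 elsewhere."""
    return 1.0 if 0.0 < x < 1.0 else 0.0

def psi(x):
    """Smooth bump exp(-1 / (1 - s^2)) with s = (x - CENTER) / HALF_WIDTH."""
    s = (x - CENTER) / HALF_WIDTH
    return math.exp(-1.0 / (1.0 - s * s)) if abs(s) < 1.0 else 0.0

def dpsi(x):
    """Derivative of psi, obtained from the chain rule."""
    s = (x - CENTER) / HALF_WIDTH
    if abs(s) >= 1.0:
        return 0.0
    return psi(x) * (-2.0 * s / (1.0 - s * s) ** 2) / HALF_WIDTH

def midpoint(func, a, b, cells=4000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / cells
    return h * sum(func(a + (k + 0.5) * h) for k in range(cells))

# Split at the kinks of g so that every piece has a smooth integrand.
pieces = [(-1.2, 0.0), (0.0, 1.0), (1.0, 1.8)]
lhs = -sum(midpoint(lambda x: g(x) * dpsi(x), a, b) for a, b in pieces)
rhs = sum(midpoint(lambda x: f(x) * psi(x), a, b) for a, b in pieces)
print(lhs, rhs)
```

The two printed numbers approximate $-\int g\psi'\,dx$ and $\int f\psi\,dx$ and should
agree up to quadrature error.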
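Example 10.3 For $t \in \mathbb{R}$, define $w \in L^1_{\mathrm{loc}}(\mathbb{R})$ by

\[
w(t) = \begin{cases} w_-(t), & t < 0, \\ w_+(t), & t > 0, \end{cases} \tag{10.5}
\]

where $w_\pm \in C^1(\mathbb{R})$. For $\psi \in C^\infty_{\mathrm{cpt}}(\mathbb{R})$,

\[
\begin{aligned}
-\int_{-\infty}^{\infty} w\psi'\, dt
  &= -\int_{-\infty}^{0} w_-\psi'\, dt - \int_0^{\infty} w_+\psi'\, dt \\
  &= \bigl[ w_+(0) - w_-(0) \bigr] \psi(0)
     + \int_{-\infty}^{0} w_-'\psi\, dt + \int_0^{\infty} w_+'\psi\, dt.
\end{aligned}
\]

The term proportional to $\psi(0)$ could not possibly come from the integral of $\psi$ against
a locally integrable function, because the value of the integrand at a single point does
not affect the integral. Hence $w$ admits a weak derivative only under the matching
condition

\[
w_-(0) = w_+(0).
\]

If this is satisfied, then the derivative is

\[
w'(t) = \begin{cases} w_-'(t), & t < 0, \\ w_+'(t), & t > 0. \end{cases}
\]

◊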
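The simplest instance of a failed matching condition is a unit step. As a worked
illustration (the symbol $H$ is introduced here for the step function and is not used
elsewhere in the text), take $w_- = 0$ and $w_+ = 1$:

```latex
\begin{aligned}
H(t) &:= \begin{cases} 0, & t < 0, \\ 1, & t > 0, \end{cases} \\
-\int_{-\infty}^{\infty} H \psi' \, dt &= -\int_0^{\infty} \psi' \, dt = \psi(0).
\end{aligned}
```

Both one-sided derivatives vanish, so only the boundary term $\psi(0)$ survives. By the
reasoning above, no locally integrable function can reproduce this term for every test
function, so $H$ has no weak derivative in $L^1_{\mathrm{loc}}(\mathbb{R})$.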


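Weak derivatives of higher order are defined by an extension of (10.4). To write the
corresponding formulas, it is helpful to have a simplified notation for higher partials.
For each multi-index $\alpha = (\alpha_1, \dots, \alpha_n)$ with $\alpha_j \in \mathbb{N}_0$, we define the differential
operator on $\mathbb{R}^n$,

\[
D^\alpha := \frac{\partial^{\alpha_1}}{\partial x_1^{\alpha_1}} \cdots \frac{\partial^{\alpha_n}}{\partial x_n^{\alpha_n}}. \tag{10.6}
\]

The order of this operator is denoted by

\[
|\alpha| := \alpha_1 + \dots + \alpha_n.
\]

Repeated integration by parts introduces a minus sign for each derivative. Therefore,
a function $u \in L^1_{\mathrm{loc}}(\Omega)$ admits a weak derivative $D^\alpha u \in L^1_{\mathrm{loc}}(\Omega)$ if

\[
\int_\Omega (D^\alpha u)\psi\, d^n x = (-1)^{|\alpha|} \int_\Omega u\, D^\alpha \psi\, d^n x \tag{10.7}
\]

for all $\psi \in C^\infty_{\mathrm{cpt}}(\Omega)$.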

It might seem that we should distinguish between classical and weak derivatives
in the notation. This is made unnecessary by the following:

Theorem 10.4 (Consistency of weak derivatives) If $u \in C^m(\Omega)$ then $u$ is weakly
differentiable to order $m$ and the weak derivatives equal the classical derivatives.

Conversely, if $u \in L^1_{\mathrm{loc}}(\Omega)$ admits weak derivatives $D^\alpha u$ for $|\alpha| \le m$, and each
$D^\alpha u$ can be represented by a continuous function, then $u$ is equivalent to a function
in $C^m(\Omega)$ whose classical derivatives match the weak derivatives.

In one direction the argument is straightforward. Classical derivatives satisfy the
criterion (10.7) by integration by parts, so they automatically qualify as weak derivatives.
Lemma 10.1 shows that weak derivatives are uniquely defined.

The argument for the converse statement, that continuity of the weak derivatives
$D^\alpha u$ implies classical differentiability, is much more technical and we will not be
able to give the details. The basic idea is to show that one can approximate $u$ by a
sequence $\psi_k \in C^\infty_{\mathrm{cpt}}(\Omega)$ such that $D^\alpha \psi_k \to D^\alpha u$ uniformly on every compact subset
of $\Omega$ for $|\alpha| \le m$. The fact that the weak derivatives $D^\alpha u$ are continuous makes
this possible. Uniform convergence then allows the classical derivatives of $u$ to be
computed as limits of the functions $D^\alpha \psi_k$.


10.2 Weak Solutions of Continuity Equations

Consider the continuity equation on $\mathbb{R}$ introduced in Sect. 3.1,

\[
\frac{\partial u}{\partial t} + \frac{\partial q}{\partial x} = 0, \qquad u|_{t=0} = g. \tag{10.8}
\]

The flux $q$ could depend on $u$ as well as $t$ and $x$. To allow for the nonlinear case, we
will assume that $u$ is real-valued here.

Suppose for the moment that $q$ is differentiable and $u$ is a classical solution of
(10.8). Let $\psi$ be a test function in $C^\infty_{\mathrm{cpt}}([0, \infty) \times \mathbb{R})$. Use of the closed interval $[0, \infty)$
means that $\psi$ and its derivatives are not necessarily zero at $t = 0$. Pairing with $\psi$
and integrating by parts thus generates a boundary term,

\[
\int_0^{\infty} \frac{\partial u}{\partial t}\, \psi\, dt
  = -u\psi\big|_{t=0} - \int_0^{\infty} u\, \frac{\partial \psi}{\partial t}\, dt.
\]

On the other hand, the spatial integration by parts has no boundary term,

\[
\int_{-\infty}^{\infty} \frac{\partial q}{\partial x}\, \psi\, dx
  = -\int_{-\infty}^{\infty} q\, \frac{\partial \psi}{\partial x}\, dx.
\]

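When the left-hand side of (10.8) is paired with $\psi$ and integrated over both $t$ and $x$,
the result is thus

\[
\int_0^{\infty}\!\! \int_{-\infty}^{\infty} \left[ \frac{\partial u}{\partial t} + \frac{\partial q}{\partial x} \right] \psi\, dx\, dt
  = -\int_0^{\infty}\!\! \int_{-\infty}^{\infty} \left[ u \frac{\partial \psi}{\partial t} + q \frac{\partial \psi}{\partial x} \right] dx\, dt
    - \int_{-\infty}^{\infty} u\psi\big|_{t=0}\, dx.
\]

If $u$ is a classical solution of (10.8), then

\[
\int_0^{\infty}\!\! \int_{-\infty}^{\infty} \left[ u \frac{\partial \psi}{\partial t} + q \frac{\partial \psi}{\partial x} \right] dx\, dt
  + \int_{-\infty}^{\infty} g\psi\big|_{t=0}\, dx = 0 \tag{10.9}
\]

for all $\psi \in C^\infty_{\mathrm{cpt}}([0, \infty) \times \mathbb{R})$.

The $t = 0$ integral in (10.9) makes sense for $g \in L^1_{\mathrm{loc}}(\mathbb{R})$. Under this assumption,
we define $u \in L^1_{\mathrm{loc}}((0, \infty) \times \mathbb{R}; \mathbb{R})$ to be a weak solution of (10.8) provided
$q \in L^1_{\mathrm{loc}}((0, \infty) \times \mathbb{R}; \mathbb{R})$ and (10.9) holds for all test functions.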

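Example 10.5 Consider the linear conservation equation with constant velocity,
which means $q = cu$ in (10.8). By the method of characteristics (Theorem 3.2),
the solution is

\[
u(t, x) = g(x - ct).
\]

Let us check that this defines a weak solution for $g \in L^1_{\mathrm{loc}}(\mathbb{R})$.

For $\psi \in C^\infty_{\mathrm{cpt}}([0, \infty) \times \mathbb{R})$ the first term in (10.9) is

\[
\int_0^{\infty}\!\! \int_{-\infty}^{\infty} g(x - ct) \left[ \frac{\partial \psi}{\partial t}(t, x) + c\, \frac{\partial \psi}{\partial x}(t, x) \right] dx\, dt.
\]

To evaluate the integral, introduce the variables

\[
\tau = t, \qquad y = x - ct,
\]

and define

\[
\tilde{\psi}(\tau, y) := \psi(\tau, y + c\tau). \tag{10.10}
\]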

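By the chain rule,

\[
\frac{\partial \tilde{\psi}}{\partial \tau} = \frac{\partial \psi}{\partial t} + c\, \frac{\partial \psi}{\partial x}. \tag{10.11}
\]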




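The Jacobian determinant of the transformation $(\tau, y) \mapsto (t, x)$ is 1, so that

\[
\int_0^{\infty}\!\! \int_{-\infty}^{\infty} u \left[ \frac{\partial \psi}{\partial t} + c\, \frac{\partial \psi}{\partial x} \right] dx\, dt
  = \int_0^{\infty}\!\! \int_{-\infty}^{\infty} g(y)\, \frac{\partial \tilde{\psi}}{\partial \tau}\, dy\, d\tau.
\]

The $\tau$ integration can now be done directly,

\[
\int_0^{\infty} \frac{\partial \tilde{\psi}}{\partial \tau}(\tau, y)\, d\tau = -\tilde{\psi}(0, y).
\]

This gives

\[
\begin{aligned}
\int_0^{\infty}\!\! \int_{-\infty}^{\infty} u \left[ \frac{\partial \psi}{\partial t} + c\, \frac{\partial \psi}{\partial x} \right] dx\, dt
  &= -\int_{-\infty}^{\infty} g(y)\, \tilde{\psi}(0, y)\, dy \\
  &= -\int_{-\infty}^{\infty} g(y)\, \psi(0, y)\, dy,
\end{aligned}
\]

which verifies (10.9). ◊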

We  saw  in  Example  10.3  that  in  one  dimension  a  jump  discontinuity  precludes 
the  existence  of  a  weak  derivative.  Example  10.5  shows  that  this  is  not  the  case  in 
higher dimension. For $g \in L^1_{\mathrm{loc}}(\mathbb{R})$, the solution $u(t, x) = g(x - ct)$ could be highly
discontinuous.  The  direction  of  the  derivative  is  crucial  here;  regularity  is  required 
only  along  the  characteristics. 

As  an  application  of  the  weak  formulation  (10.9),  let  us  return  to  an  issue  that 
arose  in  the  traffic  model  in  Sect.  3.4.  For  certain  initial  conditions  the  characteristic 
lines  crossed  each  other,  ruling  out  a  classical  solution  of  the  PDE.  We  will  see  that 
weak  solutions  can  still  exist  in  this  case. 

Consider a one-dimensional quasilinear equation of the form

\[
\frac{\partial u}{\partial t} + \frac{\partial}{\partial x}\, q(u) = 0 \tag{10.12}
\]

with the flux $q(u)$ a smooth function of $u$ which is independent of $t$ and $x$. As we saw
in Sect. 3.4, the characteristics are straight lines whose slope depends on the initial
conditions. Let us study the situation pictured in Fig. 3.11, where a shock forms as
characteristic lines cross at some point. For simplicity we assume that the initial
crossing occurs at the origin.

One possible way to resolve the issue of crossing characteristics is to subdivide the
$(t, x)$ plane into two regions by drawing a shock curve $C$, as illustrated in Fig. 10.3.
Suppose that classical solutions $u_\pm$ are derived by the method of characteristics above
and below this curve. We will show that this combination yields a weak solution
provided a certain jump condition is satisfied along $C$. The jump condition was
discovered in the 19th century by engineers William Rankine and Pierre Hugoniot,
who developed the first theories of shock waves in the context of gas dynamics.

Fig. 10.3 Shock curve with solutions $u_\pm$ on either side

Theorem 10.6 (Rankine-Hugoniot condition) Let $C$ be a curve parametrized as
$x = \sigma(t)$ with $\sigma \in C^1[0, \infty)$. Suppose that $u$ is a weak solution of (10.12) given by

\[
u(t, x) = \begin{cases} u_-(t, x), & x < \sigma(t), \\ u_+(t, x), & x > \sigma(t), \end{cases}
\]

where $u_\pm$ are classical solutions. Then, at each point of $C$,

\[
q(u_+) - q(u_-) = (u_+ - u_-)\, \sigma'. \tag{10.13}
\]

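Proof Since we are not concerned with the boundary conditions, we consider a test
function $\psi \in C^\infty_{\mathrm{cpt}}((0, \infty) \times \mathbb{R})$, for which (10.9) specializes to

\[
\int_0^{\infty}\!\! \int_{-\infty}^{\infty} \left[ u\, \frac{\partial \psi}{\partial t} + q(u)\, \frac{\partial \psi}{\partial x} \right] dx\, dt = 0. \tag{10.14}
\]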

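Since the solutions $u_\pm$ are classical and $\sigma$ is $C^1$, we can separate the integral (10.14)
at the shock curve and integrate by parts on either side.

Consider first the $u_-$ side. For the term involving the $x$ derivative the integration
by parts is straightforward:

\[
\int_0^{\infty}\!\! \int_{-\infty}^{\sigma(t)} q(u_-)\, \frac{\partial \psi}{\partial x}\, dx\, dt
  = -\int_0^{\infty}\!\! \int_{-\infty}^{\sigma(t)} \psi\, \frac{\partial}{\partial x} q(u_-)\, dx\, dt
    + \int_0^{\infty} \psi\, q(u_-)\Big|_{x=\sigma(t)}\, dt.
\]

For the $t$-derivative term we start by using the fundamental theorem of calculus to
derive

\[
\frac{d}{dt} \int_{-\infty}^{\sigma(t)} \psi u_-\, dx
  = \int_{-\infty}^{\sigma(t)} \left[ \frac{\partial \psi}{\partial t}\, u_- + \psi\, \frac{\partial u_-}{\partial t} \right] dx
    + \sigma'(t)\, \psi u_-\Big|_{x=\sigma(t)}.
\]

By the compact support of $\psi$, the integral over $t$ of the left-hand side vanishes,
yielding

\[
\int_0^{\infty}\!\! \int_{-\infty}^{\sigma(t)} u_-\, \frac{\partial \psi}{\partial t}\, dx\, dt
  = -\int_0^{\infty}\!\! \int_{-\infty}^{\sigma(t)} \psi\, \frac{\partial u_-}{\partial t}\, dx\, dt
    - \int_0^{\infty} \sigma'(t)\, \psi u_-\Big|_{x=\sigma(t)}\, dt.
\]

Combining these integration by parts formulas gives

\[
\begin{aligned}
\int_0^{\infty}\!\! \int_{-\infty}^{\sigma(t)} \left[ u_-\, \frac{\partial \psi}{\partial t} + q(u_-)\, \frac{\partial \psi}{\partial x} \right] dx\, dt
  ={}& -\int_0^{\infty}\!\! \int_{-\infty}^{\sigma(t)} \psi \left[ \frac{\partial u_-}{\partial t} + \frac{\partial}{\partial x} q(u_-) \right] dx\, dt \\
     & - \int_0^{\infty} \bigl[ \sigma'(t)\, u_- - q(u_-) \bigr] \psi\Big|_{x=\sigma(t)}\, dt.
\end{aligned}
\]

The first term on the right vanishes by the assumption that $u_-$ is a classical solution,
leaving

\[
\int_0^{\infty}\!\! \int_{-\infty}^{\sigma(t)} \left[ u_-\, \frac{\partial \psi}{\partial t} + q(u_-)\, \frac{\partial \psi}{\partial x} \right] dx\, dt
  = -\int_0^{\infty} \bigl[ \sigma' u_- - q(u_-) \bigr] \psi\Big|_{x=\sigma(t)}\, dt.
\]

The corresponding calculation on the $u_+$ side yields

\[
\int_0^{\infty}\!\! \int_{\sigma(t)}^{\infty} \left[ u_+\, \frac{\partial \psi}{\partial t} + q(u_+)\, \frac{\partial \psi}{\partial x} \right] dx\, dt
  = \int_0^{\infty} \bigl[ \sigma' u_+ - q(u_+) \bigr] \psi\Big|_{x=\sigma(t)}\, dt.
\]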

By  (10.14)  the  sum  of  the  U-  and  u+  integrals  is  zero,  which  implies 


•oo  _ 


(u+  —  U-)cr'  —  ( q(u+ )  —  q(u _))  p) 


dt, 


x=a(t ) 


Since this holds for all $\psi \in C^\infty_{\mathrm{cpt}}((0,\infty)\times\mathbb{R})$, we conclude that

$$\Big[ (u_+ - u_-)\sigma' - \big( q(u_+) - q(u_-) \big) \Big]\Big|_{x=\sigma(t)} = 0$$

for all $t > 0$. □

Example 10.7 Consider the traffic equation introduced in Sect. 3.4,

$$\frac{\partial u}{\partial t} + (1 - 2u)\frac{\partial u}{\partial x} = 0,$$


for which $q(u) = u - u^2$. For the initial condition take a step function,

$$g(x) := \begin{cases} a, & x < 0, \\ b, & x > 0. \end{cases} \tag{10.15}$$


From (3.28) the characteristic lines are given by

$$x(t) = \begin{cases} x_0 + (1 - 2a)t, & x_0 < 0, \\ x_0 + (1 - 2b)t, & x_0 > 0. \end{cases}$$


These  intersect  to  form  a  shock  provided  a  <  b. 

The solutions above and below the shock line are given by constants,

$$u_-(t,x) = a, \qquad u_+(t,x) = b.$$

The Rankine-Hugoniot condition (10.13) thus reduces to

$$q(b) - q(a) = (b - a)\sigma'.$$

Substituting $q(u) = u - u^2$ reduces this condition to

$$\sigma' = 1 - b - a.$$

Since the discontinuity starts at the origin, the shock curve is thus given by

$$\sigma(t) = (1 - b - a)t.$$

Hence the weak solution is

$$u(t,x) = \begin{cases} a, & x < (1 - b - a)t, \\ b, & x > (1 - b - a)t. \end{cases}$$

Some cases are illustrated in Fig. 10.4. In the plot on the left, the shock wave
propagates backwards. ◊
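The computation in Example 10.7 can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `traffic_flux`, `shock_speed`, and `step_solution` are hypothetical, and the code assumes the step data (10.15) with $a < b$, so that a single shock starts at the origin.

```python
def traffic_flux(u: float) -> float:
    """Flux q(u) = u - u^2 of the traffic equation in Example 10.7."""
    return u - u * u


def shock_speed(q, u_minus: float, u_plus: float) -> float:
    """Rankine-Hugoniot slope sigma' = (q(u+) - q(u-)) / (u+ - u-)."""
    if u_plus == u_minus:
        raise ValueError("no discontinuity: u_minus equals u_plus")
    return (q(u_plus) - q(u_minus)) / (u_plus - u_minus)


def step_solution(a: float, b: float, t: float, x: float) -> float:
    """Weak solution for step data (a for x < 0, b for x > 0), assuming a < b."""
    sigma = shock_speed(traffic_flux, a, b) * t  # shock curve starts at the origin
    return a if x < sigma else b


if __name__ == "__main__":
    # Compare the Rankine-Hugoniot quotient with the closed form 1 - b - a.
    for a, b in [(0.2, 0.9), (0.1, 0.4)]:
        speed = shock_speed(traffic_flux, a, b)
        print(f"a={a}, b={b}: sigma'={speed:.3f}, 1-b-a={1 - b - a:.3f}")
```

For $a + b > 1$ the computed slope is negative, which corresponds to a shock that propagates backwards.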

For certain initial conditions, the definition (10.9) of a weak solution is not sufficient
to determine the solution uniquely. For example, if we had taken $a > b$ in
(10.15), then instead of overlapping, the characteristics originating from $t = 0$ would
separate, leaving a triangular region with no characteristic lines. An additional physical
condition is required to specify the solution uniquely in this case. We will discuss
this further in the exercises.

Fig. 10.4 Characteristic lines meeting at the shock wave


10.3  Sobolev  Spaces 

Boundary values are not well defined for locally integrable functions. We were able
to avoid this issue in the discussion of the continuity equation in Sect. 10.2, because
solutions were required to be constant along characteristics. In general, the formulation
of boundary or initial conditions requires a class of functions with greater
regularity.

The most obvious class to consider consists of functions that admit weak higher
partial derivatives. However, it proves to be very helpful to strengthen the integrability
requirements as well. Such function spaces were introduced by Sergei Sobolev in
the mid-20th century and have since become fundamental tools of analysis.

The Sobolev spaces based on $L^2$ are defined by

$$H^m(\Omega) := \left\{ u \in L^2(\Omega);\ D^\alpha u \in L^2(\Omega) \text{ for all } |\alpha| \le m \right\},$$

for $m \in \mathbb{N}_0$, with derivatives interpreted in the weak sense. An extended family of
Sobolev spaces $W^{m,p}$ is given by replacing $L^2$ with $L^p$ in the definition. The extended
family is important in the analysis of nonlinear PDE, but our focus will be limited to
linear applications involving $H^m$.

Sobolev spaces are useful as theoretical tools, but they also have a practical side.
For a bounded domain $\Omega$, the space $H^1(\Omega)$ includes the continuous piecewise linear
functions. A function is called piecewise linear if the domain can be decomposed into
a finite number of polygonal subdomains, on which the function is linear. Figure 10.5
shows a two-dimensional example. Sobolev spaces provide a natural framework for
the approximation of solutions by computationally simple classes of functions.

Fig. 10.5 Graph of a piecewise linear $H^1$ function on the unit square

The space $H^m(\Omega)$ carries a natural inner product,

$$\langle u, v \rangle_{H^m} := \sum_{|\alpha| \le m} \langle D^\alpha u, D^\alpha v \rangle. \tag{10.16}$$

(Our convention will be that a bracket without subscript denotes the $L^2$ inner product.)
The corresponding norm is

$$\|u\|_{H^m} := \Big( \sum_{|\alpha| \le m} \|D^\alpha u\|_2^2 \Big)^{1/2}. \tag{10.17}$$

Lebesgue  integration  theory  gives  us  the  following  completeness  result,  analogous 
to  Theorem 7.7. 

Theorem 10.8 For $\Omega \subset \mathbb{R}^n$ and $m \in \mathbb{N}_0$, $H^m(\Omega)$ is a Hilbert space.
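For continuous piecewise linear functions in one dimension the $H^1$ inner product can be evaluated exactly, cell by cell. The sketch below is an illustration under stated assumptions rather than anything from the text: the names `h1_inner` and `h1_norm` are hypothetical, the functions are assumed real-valued, and the grid is assumed uniform with spacing `h`.

```python
import math


def h1_inner(u, v, h):
    """H^1 inner product of two continuous piecewise linear functions.

    u, v: lists of nodal values on a uniform grid with spacing h (real-valued).
    Both functions are linear on each cell, so the cell integrals are exact.
    """
    if len(u) != len(v):
        raise ValueError("u and v must have the same number of nodes")
    total = 0.0
    for i in range(len(u) - 1):
        # Exact L^2 pairing of two linear functions on one cell.
        total += h / 6.0 * (2 * u[i] * v[i] + u[i] * v[i + 1]
                            + u[i + 1] * v[i] + 2 * u[i + 1] * v[i + 1])
        # Pairing of the derivatives, which are constant on the cell.
        total += (u[i + 1] - u[i]) * (v[i + 1] - v[i]) / h
    return total


def h1_norm(u, h):
    """Norm induced by h1_inner."""
    return math.sqrt(h1_inner(u, u, h))


if __name__ == "__main__":
    hat = [0.0, 1.0, 0.0]  # tent function on [0, 2] with nodes 0, 1, 2
    print("squared H^1 norm of the tent function:", h1_inner(hat, hat, 1.0))
```

For the tent function on $[0,2]$ the $L^2$ part contributes $2/3$ and the derivative part contributes $2$, so the squared norm is $8/3$.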

Recall that Theorem 7.5 says that $C^\infty_{\mathrm{cpt}}(\Omega)$ is a dense subspace of $L^2(\Omega)$. This
means that the closure of $C^\infty_{\mathrm{cpt}}(\Omega)$ with respect to the $L^2$ norm is $L^2(\Omega)$. This result
no longer holds for the Sobolev space $H^m(\Omega)$ with $m \ge 1$. In particular, the closure
of $C^\infty_{\mathrm{cpt}}(\Omega)$ with respect to the $H^1$ norm defines a subspace

$$H^1_0(\Omega) := \left\{ u \in H^1(\Omega);\ \lim_{k\to\infty} \|u - \psi_k\|_{H^1} = 0 \text{ for some sequence } \psi_k \in C^\infty_{\mathrm{cpt}}(\Omega) \right\}. \tag{10.18}$$

By Lemma 7.8, $H^1_0(\Omega)$ is also a Hilbert space with respect to the $H^1$ norm.

If $\partial\Omega$ is piecewise $C^1$, then for functions in $H^1(\Omega)$ it is possible to define boundary
restrictions in $L^2(\partial\Omega)$ that generalize the boundary restriction of a continuous
function. In this case, $H^1_0(\Omega)$ consists precisely of the functions whose boundary
restriction vanishes. Thus the space $H^1_0(\Omega)$ can be interpreted as the class of $H^1$
functions satisfying Dirichlet boundary conditions on $\partial\Omega$.

The  theory  of  boundary  restrictions  is  too  technical  for  us  to  cover  here,  but  we 
can  at  least  show  how  this  works  in  the  one-dimensional  case. 

Theorem 10.9 If $u \in H^1_0(a,b)$ then $u$ is continuous on $[a,b]$ and equal to zero at
the endpoints.

Proof Suppose $u \in H^1_0(a,b)$. By definition, there exists a sequence of $\psi_k \in C^\infty_{\mathrm{cpt}}(a,b)$
such that

$$\lim_{k\to\infty} \|\psi_k - u\|_{H^1} = 0.$$

For $x \in [a,b]$,

$$\psi_j(x) - \psi_k(x) = \int_a^x \left[ \psi_j'(t) - \psi_k'(t) \right] dt.$$


The integral on the right could be expressed as an $L^2$ inner product,

$$\int_a^x \left[ \psi_j'(t) - \psi_k'(t) \right] dt = \langle \psi_j' - \psi_k', \chi_{[a,x]} \rangle,$$

where $\chi_I$ denotes the characteristic function of the interval $I$. Thus, by the Cauchy-Schwarz
inequality (Theorem 7.1),

$$|\psi_j(x) - \psi_k(x)| \le \sqrt{x - a}\;\|\psi_j' - \psi_k'\|_2.$$

In view of the definition of the $H^1$ norm, this implies the uniform bound

$$\|\psi_j - \psi_k\|_\infty \le \sqrt{b - a}\;\|\psi_j - \psi_k\|_{H^1}. \tag{10.19}$$

Since $\{\psi_k\}$ converges and is therefore Cauchy with respect to the $H^1$ norm, it
follows from (10.19) that the sequence $\{\psi_k\}$ is also Cauchy in the uniform
sense. By the completeness of $L^\infty(a,b)$ (Theorem 7.7) and Lemma 8.4, this implies
that $\psi_k \to g$ uniformly for some $g \in C^0[a,b]$.

At this point we have $\psi_k \to u$ in $H^1$ and $\psi_k \to g$ uniformly. Uniform convergence
on a bounded interval implies convergence in $L^2$, by a simple integral estimate.
Therefore $\psi_k \to g$ in $L^2$ also, implying that $u = g$ in $L^2$. Hence $u \in C^0[a,b]$.

To show that $u$ vanishes at the endpoints, note that

$$\max\{|u(a)|, |u(b)|\} \le \sup_{[a,b]} |\psi_k - u| \tag{10.20}$$

because $\psi_k(a) = \psi_k(b) = 0$. By uniform convergence, the right-hand side of (10.20)
approaches zero as $k \to \infty$, showing that

$$u(a) = u(b) = 0.$$

□
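The key step of the proof is the pointwise Cauchy-Schwarz estimate $|\psi(x)| \le \sqrt{x-a}\,\|\psi'\|_2$ for a function vanishing at $a$. The following sketch checks this numerically for one sample function. The choice $\psi(x) = \sin^2(\pi x)$ on $(0,1)$ and the helper names `l2_norm` and `worst_ratio` are assumptions made for illustration; this $\psi$ vanishes at both endpoints but is not compactly supported in the open interval.

```python
import math


def l2_norm(f, a, b, n=20000):
    """Approximate the L^2(a, b) norm of f with the composite midpoint rule."""
    h = (b - a) / n
    return math.sqrt(h * sum(f(a + (i + 0.5) * h) ** 2 for i in range(n)))


def worst_ratio(psi, dpsi, a, b, samples=200):
    """Largest sampled value of |psi(x)| / (sqrt(x - a) * ||psi'||_2)."""
    norm = l2_norm(dpsi, a, b)
    ratios = []
    for i in range(1, samples + 1):
        x = a + (b - a) * i / samples
        ratios.append(abs(psi(x)) / (math.sqrt(x - a) * norm))
    return max(ratios)


def psi(x):
    """Sample function with psi(0) = psi(1) = 0 (an illustrative choice)."""
    return math.sin(math.pi * x) ** 2


def dpsi(x):
    """Derivative of psi: 2*pi*sin(pi x)*cos(pi x) = pi*sin(2 pi x)."""
    return math.pi * math.sin(2.0 * math.pi * x)


if __name__ == "__main__":
    # The estimate predicts that this ratio never exceeds 1.
    print("worst ratio:", worst_ratio(psi, dpsi, 0.0, 1.0))
```

A ratio at most 1 at every sample point is consistent with the bound; it is a sanity check on one function, not a proof.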

In higher dimensions, functions in $H^1$ are not necessarily continuous. However,
$H^m$ does imply continuity if $m$ is sufficiently large relative to the dimension. We will
develop this regularity theory in Sect. 10.4.

We conclude this section with an extension property that will prove useful in
Chap. 11.

Lemma 10.10 For $\Omega \subset \tilde\Omega \subset \mathbb{R}^n$, the extension by zero of an element of $H^1_0(\Omega)$
gives an element of $H^1_0(\tilde\Omega)$.

Proof For $u \in H^1_0(\Omega)$, let $\tilde u$ denote the extension by zero to $\tilde\Omega$. The weak gradient
$\nabla u \in L^2(\Omega; \mathbb{R}^n)$ can also be extended by zero to $\widetilde{\nabla u} \in L^2(\tilde\Omega; \mathbb{R}^n)$. We need to
show that $\widetilde{\nabla u}$ is the weak gradient of $\tilde u$. This is the condition that

$$\int_{\tilde\Omega} \phi\,\widetilde{\nabla u}\;d^n x = -\int_{\tilde\Omega} \tilde u\,\nabla\phi\;d^n x, \tag{10.21}$$

for all $\phi \in C^\infty_{\mathrm{cpt}}(\tilde\Omega)$.

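By the definition of $H^1_0(\Omega)$, there exists a sequence of $\psi_k \in C^\infty_{\mathrm{cpt}}(\Omega)$ such that
$\psi_k \to u$ in the $H^1$ norm. Since $\psi_k$ has compact support within $\Omega$, integration by
parts gives

$$\int_\Omega \phi\,\nabla\psi_k\;d^n x = -\int_\Omega \psi_k\,\nabla\phi\;d^n x. \tag{10.22}$$

By the $H^1$ convergence $\psi_k \to u$, we can take the limit $k \to \infty$ on both sides of
(10.22) to obtain

$$\int_\Omega \phi\,\nabla u\;d^n x = -\int_\Omega u\,\nabla\phi\;d^n x.$$

Since $\tilde u$ and $\widetilde{\nabla u}$ are equal to $u$ and $\nabla u$ on $\Omega$ and vanish on $\tilde\Omega - \Omega$, this is equivalent
to (10.21). □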


10.4  Sobolev  Regularity 

In  this  section  we  will  consider  the  relationship  between  weak  regularity,  defined  in 
terms  of  Sobolev  spaces,  and  regularity  in  the  classical  sense.  This  connection  plays 
a  central  role  in  the  application  of  Sobolev  spaces  to  PDE. 

Theorem 10.11 (Sobolev embedding theorem) Suppose $\Omega \subset \mathbb{R}^n$ is a bounded
domain. If $m > k + n/2$, then

$$H^m(\Omega) \subset C^k(\Omega).$$

This result can be sharpened and extended in various ways. One important variant
includes differentiability up to the boundary under certain conditions on $\partial\Omega$. For
example, if the boundary $\partial\Omega$ is piecewise $C^1$ then it is possible to show that

$$H^m(\Omega) \subset C^k(\overline{\Omega}).$$

These boundary results are quite important but too technically difficult for us to
include here.

The strategy we will use for Theorem 10.11 is based on the connection established
in Sect. 8.6 between regularity and the decay of Fourier coefficients. Recall
the definition $\mathbb{T} := \mathbb{R}/2\pi\mathbb{Z}$ introduced in Sect. 8.2. To extend Fourier series to higher
dimensions we introduce the corresponding space

$$\mathbb{T}^n := \mathbb{R}^n/(2\pi\mathbb{Z})^n.$$

A function on $\mathbb{T}^n$ is a function on $\mathbb{R}^n$ which is $2\pi$-periodic in each coordinate.

The periodic Fourier series theory can be carried over to $\mathbb{T}^n$ directly. For
$f \in L^2(\mathbb{T}^n)$ and $k \in \mathbb{Z}^n$ we define the coefficients

$$c_k[f] := \frac{1}{(2\pi)^n} \int_{\mathbb{T}^n} e^{-ik\cdot x} f(x)\;d^n x. \tag{10.23}$$


The integral over $\mathbb{T}^n$ can be taken over $[-\pi,\pi]^n$, or any translate of this cube. The
argument from Theorem 8.6 can be adapted, with minor notational changes, to prove
the following:

Theorem 10.12 For $f \in L^2(\mathbb{T}^n)$, the series

$$\sum_{k\in\mathbb{Z}^n} c_k[f]\,e^{ik\cdot x}$$

converges to $f$ in the $L^2$ norm.

As a corollary, we obtain the generalization of the Parseval identity (8.36),

$$\langle f, g \rangle = (2\pi)^n \sum_{k\in\mathbb{Z}^n} \overline{c_k[f]}\,c_k[g], \tag{10.24}$$

for $f, g \in L^2(\mathbb{T}^n)$.

Because of the periodic structure of $\mathbb{T}^n$, it is not necessary to assume that test functions
have compact support. For $f \in L^1_{\mathrm{loc}}(\mathbb{T}^n)$ the weak derivative $D^\alpha f \in L^1_{\mathrm{loc}}(\mathbb{T}^n)$ is
defined by the condition that

$$\int_{\mathbb{T}^n} \psi\,D^\alpha f\;d^n x = (-1)^{|\alpha|} \int_{\mathbb{T}^n} f\,D^\alpha\psi\;d^n x \tag{10.25}$$

for all $\psi \in C^\infty(\mathbb{T}^n)$. The space $H^m(\mathbb{T}^n)$ consists of functions in $L^2(\mathbb{T}^n)$ which have
weak partial derivatives up to order $m$ contained in $L^2(\mathbb{T}^n)$.

It is convenient to notate powers of the components of $k$ by analogy with $D^\alpha$,

$$k^\alpha := k_1^{\alpha_1} \cdots k_n^{\alpha_n},$$

for $\alpha = (\alpha_1, \dots, \alpha_n)$ with $\alpha_j \in \mathbb{N}_0$. A simple computation shows that

$$D^\alpha e^{ik\cdot x} = (ik)^\alpha e^{ik\cdot x}.$$

Thus, for $f \in H^m(\mathbb{T}^n)$, substituting $\psi = e^{-ik\cdot x}$ into (10.25) gives

$$c_k[D^\alpha f] = (ik)^\alpha c_k[f] \tag{10.26}$$

for $|\alpha| \le m$. This generalizes the integration by parts formula (8.30).

Theorem 10.13 A function $f \in L^2(\mathbb{T}^n)$ lies in $H^m(\mathbb{T}^n)$ for $m \in \mathbb{N}$ if and only if

$$\sum_{k\in\mathbb{Z}^n} |k|^{2m}\,|c_k[f]|^2 < \infty. \tag{10.27}$$

Proof By (10.26) and Bessel’s inequality (Proposition 7.9), the condition that
$D^\alpha f \in L^2(\mathbb{T}^n)$ implies that

$$\sum_{k\in\mathbb{Z}^n} |k^\alpha c_k[f]|^2 < \infty. \tag{10.28}$$

This holds for all $|\alpha| \le m$, implying (10.27).

Conversely, if $f \in L^2(\mathbb{T}^n)$ satisfies (10.27), then (10.28) holds for $|\alpha| \le m$. We
can therefore define functions $g_\alpha \in L^2(\mathbb{T}^n)$ by the Fourier series

$$g_\alpha(x) := \sum_{k\in\mathbb{Z}^n} (ik)^\alpha c_k[f]\,e^{ik\cdot x}.$$

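By Parseval’s identity (10.24), the inner product of $g_\alpha$ with $\psi \in C^\infty(\mathbb{T}^n)$ gives

$$\begin{aligned}
\langle g_\alpha, \psi \rangle &= (2\pi)^n \sum_{k\in\mathbb{Z}^n} \overline{(ik)^\alpha c_k[f]}\;c_k[\psi] \\
&= (-1)^{|\alpha|} (2\pi)^n \sum_{k\in\mathbb{Z}^n} \overline{c_k[f]}\;c_k[D^\alpha\psi] \\
&= (-1)^{|\alpha|} \langle f, D^\alpha\psi \rangle.
\end{aligned}$$

This shows that the weak derivative $D^\alpha f$ exists and is equal to $g_\alpha$. □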
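The relation (10.26) between the coefficients of a function and of its derivative can be observed numerically in one dimension. The sketch below is a hedged illustration: `fourier_coefficient` and `max_mismatch` are hypothetical names, the sample function $f(x) = e^{\cos x}$ is an arbitrary smooth periodic choice, and the integral in (10.23) is approximated by the rectangle rule on a periodic grid.

```python
import cmath
import math


def fourier_coefficient(f, k, n=256):
    """Approximate c_k[f] = (1/2pi) * int_{-pi}^{pi} e^{-ikx} f(x) dx.

    Uses the rectangle rule on n equally spaced points, which is very
    accurate for smooth periodic integrands.
    """
    h = 2.0 * math.pi / n
    total = 0j
    for j in range(n):
        x = -math.pi + j * h
        total += cmath.exp(-1j * k * x) * f(x)
    return total * h / (2.0 * math.pi)


def f(x):
    """Smooth 2pi-periodic sample function (an illustrative choice)."""
    return math.exp(math.cos(x))


def df(x):
    """Classical derivative of f."""
    return -math.sin(x) * math.exp(math.cos(x))


def max_mismatch(k_max=5):
    """Largest value of |c_k[f'] - (ik) c_k[f]| for |k| <= k_max."""
    return max(
        abs(fourier_coefficient(df, k) - 1j * k * fourier_coefficient(f, k))
        for k in range(-k_max, k_max + 1)
    )


if __name__ == "__main__":
    print("max mismatch in c_k[f'] = ik c_k[f]:", max_mismatch())
```

The mismatch should be at the level of rounding error, since for a smooth function the weak and classical derivatives agree.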

Theorem  10.13  makes  the  connection  between  Sobolev  regularity  and  decay  of 
Fourier  coefficients.  Our  task  is  now  to  translate  this  back  into  classical  regularity. 

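Theorem 10.14 (Periodic Sobolev embedding) If $m > q + n/2$, then

$$H^m(\mathbb{T}^n) \subset C^q(\mathbb{T}^n).$$

Proof Using the notation for discrete spaces introduced in Sect. 7.4, the space $\ell^2(\mathbb{Z}^n)$
is defined as the Hilbert space of functions $\mathbb{Z}^n \to \mathbb{C}$, equipped with the inner product

$$\langle \beta, \gamma \rangle_{\ell^2} := \sum_{k\in\mathbb{Z}^n} \overline{\beta(k)}\,\gamma(k).$$

Consider the function

$$\beta(k) := (1 + |k|)^{-m}.$$

The $\ell^2$ norm of $\beta$ can be estimated with an integral,

$$\|\beta\|_{\ell^2}^2 := \sum_{k\in\mathbb{Z}^n} (1 + |k|)^{-2m} \le C \int_{\mathbb{R}^n} (1 + |x|)^{-2m}\,d^n x = C A_n \int_0^\infty (1 + r)^{-2m} r^{n-1}\,dr,$$

where the constant $C$, which depends only on $n$ and $m$, accounts for the comparison of
each term of the sum with the integral over the unit cube centered at $k$.
The integral is finite if $2m > n$, implying that $\beta \in \ell^2(\mathbb{Z}^n)$ for $m > n/2$.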

By Theorem 10.13, for $f \in H^m(\mathbb{T}^n)$ we can also define an element of $\ell^2(\mathbb{Z}^n)$ by

$$\gamma(k) := (1 + |k|)^m\,|c_k[f]|,$$

so that

$$\langle \beta, \gamma \rangle_{\ell^2} = \sum_{k\in\mathbb{Z}^n} |c_k[f]|.$$

It then follows from the Cauchy-Schwarz inequality on $\ell^2$ that

$$\sum_{k\in\mathbb{Z}^n} |c_k[f]| \le \|\beta\|_{\ell^2}\,\|\gamma\|_{\ell^2}, \tag{10.29}$$

which is finite for $m > n/2$.

Since $|e^{ik\cdot x}| = 1$, the estimate (10.29) implies that the Fourier series for $f$ converges
uniformly. By Lemma 8.4 the limit of this series is continuous. Thus, after
possible replacement by an equivalent function in $L^2$, $f$ is continuous.

This argument shows that

$$H^m(\mathbb{T}^n) \subset C^0(\mathbb{T}^n) \tag{10.30}$$

for $m > n/2$. To apply it to higher derivatives we note that if $f \in H^m(\mathbb{T}^n)$ for $m > q + n/2$
then for $|\alpha| \le q$ the weak derivatives $D^\alpha f$ will lie in $H^{m-q}(\mathbb{T}^n)$. For $m > q + n/2$
it follows from (10.30) that these derivatives are continuous. By Theorem 10.4, this
shows that $f \in C^q(\mathbb{T}^n)$. □

We are now prepared to derive the Sobolev embedding result for a bounded domain
as a consequence of Theorem 10.14.

Proof of Theorem 10.11 Suppose $u \in H^m(\Omega)$ for $\Omega \subset \mathbb{R}^n$, and let $x_0 \in \Omega$. Because $\Omega$
is open, we can choose $\varepsilon > 0$ small enough that

$$B(x_0; \varepsilon) \subset \Omega.$$

Suppose that $\psi \in C^\infty_{\mathrm{cpt}}(\Omega)$ has support contained in $B(x_0; \varepsilon)$ and is equal to 1
inside $B(x_0; \varepsilon/2)$. (Such a function can be constructed as in Example 2.2.) Since
$\psi$ is smooth, $u\psi \in H^m(\Omega)$ also. Thus, assuming $\varepsilon < 2\pi$, we can extend $u\psi$ by
periodicity to a function in $H^m(\mathbb{T}^n)$. Theorem 10.14 then shows that $u\psi \in C^k(\mathbb{T}^n)$
if $m > k + n/2$. Since $u\psi$ and $u$ agree in a neighborhood of $x_0$, this shows that $u$
is $k$-times continuously differentiable at $x_0$. This argument applies at every interior
point of $\Omega$, so we conclude that $u \in C^k(\Omega)$. □


10.5  Weak  Formulation  of  Elliptic  Equations 

The Laplace equation introduced in Sect. 9.1 is the prototypical elliptic equation.
Another classic example is the Poisson equation $-\Delta u = f$, which we will discuss
in more detail in Sect. 11.1.

If $\Omega \subset \mathbb{R}^n$ is a bounded domain, then for $u, \psi \in C^\infty_{\mathrm{cpt}}(\Omega)$, Green’s first identity
(Theorem 2.10) gives

$$\int_\Omega \psi\,\Delta u\;d^n x = -\int_\Omega \nabla u \cdot \nabla\psi\;d^n x. \tag{10.31}$$


On the other hand, the $H^1$ inner product on $\Omega$ is given by

$$\langle u, \psi \rangle_{H^1} := \langle u, \psi \rangle + \int_\Omega \nabla\bar{u} \cdot \nabla\psi\;d^n x. \tag{10.32}$$

The right-hand side of (10.31) is thus well-defined for $u \in H^1_0(\Omega)$.

To account for applications to the Helmholtz equation as well as the Laplace
equation, let us consider the PDE

$$-\Delta u = \lambda u + f, \qquad u|_{\partial\Omega} = 0. \tag{10.33}$$

For $f \in L^2(\Omega)$, we say that $u \in H^1_0(\Omega)$ constitutes a weak solution of (10.33) if

$$\int_\Omega \left[ \nabla u \cdot \nabla\psi - \lambda u\psi - f\psi \right] d^n x = 0 \tag{10.34}$$

for every $\psi \in C^\infty_{\mathrm{cpt}}(\Omega)$. This definition could be extended to more general elliptic
equations of the form $Lu = f$, but for simplicity we restrict our attention to the case
of the Laplacian. We will study the existence and regularity of solutions of (10.34)
extensively in Chap. 11. For now, we consider a simple one-dimensional case to
illustrate how the definition works.

Example 10.15 On the interval $[0, 2]$, consider the equation

$$u'' = f, \qquad u(0) = u(2) = 0,$$

with

$$f(x) = \begin{cases} x, & 0 \le x < 1, \\ -1, & 1 < x \le 2. \end{cases}$$

Since $f$ is piecewise linear, it makes sense to try using classical solutions on the two
subintervals. Imposing the boundary and continuity requirements gives a family of
possible solutions

$$u(x) = \begin{cases} \frac{1}{6}x^3 - ax, & 0 \le x \le 1, \\ -\frac{1}{2}x^2 + \left(a + \frac{4}{3}\right)x - 2a - \frac{2}{3}, & 1 \le x \le 2. \end{cases}$$


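To determine $a$ we apply the weak solution condition,

$$\int_0^2 \left[ u'\psi' + f\psi \right] dx = 0, \tag{10.35}$$

for $\psi \in C^\infty_{\mathrm{cpt}}(0, 2)$. Using integration by parts, the first term evaluates to

$$\begin{aligned}
\int_0^2 u'\psi'\,dx &= \int_0^1 \left(\tfrac{1}{2}x^2 - a\right)\psi'(x)\,dx + \int_1^2 \left(-x + a + \tfrac{4}{3}\right)\psi'(x)\,dx \\
&= \left(\tfrac{1}{2} - a\right)\psi(1) - \int_0^1 x\,\psi(x)\,dx - \left(a + \tfrac{1}{3}\right)\psi(1) + \int_1^2 \psi(x)\,dx \\
&= \left(\tfrac{1}{6} - 2a\right)\psi(1) - \int_0^2 f\psi\,dx.
\end{aligned}$$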


The weak solution condition (10.35) requires $a = \frac{1}{12}$. This gives

$$u(x) = \begin{cases} \frac{1}{6}x^3 - \frac{1}{12}x, & 0 \le x \le 1, \\ -\frac{1}{2}x^2 + \frac{17}{12}x - \frac{5}{6}, & 1 \le x \le 2. \end{cases}$$

This result is illustrated in Fig. 10.6. Note that the condition on $a$ corresponds to
a matching of the first derivatives at $x = 1$, so that $u \in C^1[0, 2]$. ◊
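The role of the parameter $a$ in Example 10.15 can be checked directly. The following sketch is illustrative only: `derivative_jump` and `weak_residual` are hypothetical names, and the trial function $\psi(x) = \sin^2(\pi x/2)$ is an assumption. It vanishes at $x = 0$ and $x = 2$, which is all the integration by parts requires, but it is not compactly supported in $(0,2)$.

```python
import math
from fractions import Fraction


def derivative_jump(a):
    """Exact jump u'(1+) - u'(1-) for the family of Example 10.15."""
    left = Fraction(1, 2) - a           # derivative of x^3/6 - a x at x = 1
    right = -1 + a + Fraction(4, 3)     # derivative of the second branch at x = 1
    return right - left


def psi(x):
    """Trial function vanishing at x = 0 and x = 2 (illustrative choice)."""
    return math.sin(math.pi * x / 2.0) ** 2


def dpsi(x):
    """Derivative of psi: (pi/2) * sin(pi x)."""
    return 0.5 * math.pi * math.sin(math.pi * x)


def weak_residual(a, n=4000):
    """Midpoint-rule value of int_0^2 (u' psi' + f psi) dx, split at x = 1."""
    a = float(a)
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h               # midpoint in (0, 1): u' = x^2/2 - a, f = x
        total += ((0.5 * x * x - a) * dpsi(x) + x * psi(x)) * h
        y = 1.0 + x                     # midpoint in (1, 2): u' = -y + a + 4/3, f = -1
        total += ((-y + a + 4.0 / 3.0) * dpsi(y) - psi(y)) * h
    return total


if __name__ == "__main__":
    for a in (Fraction(0), Fraction(1, 12)):
        print(f"a={a}: jump={derivative_jump(a)}, residual={weak_residual(a):.2e}")
```

Since $\psi(1) = 1$ for this trial function, the residual should be close to $\frac{1}{6} - 2a$, which vanishes exactly when the derivative jump does.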

We will show in Sect. 11.3 that solutions of (10.34) are unique, so the function
obtained in Example 10.15 is the only possible solution. The matching of derivatives
required for this solution is indicative of a more general regularity property for
solutions of elliptic equations, which we will discuss in detail in Sect. 11.4.


Fig. 10.6 One-dimensional weak solution


10.6  Weak  Formulation  of  Evolution  Equations 

The  heat  and  wave  equations  are  the  primary  examples  of  linear  evolution  equations. 
Weak  solutions  for  these  equations  can  be  defined  by  essentially  the  same  strategy 
used  in  Sect.  10.5.  Starting  from  a  classical  solution,  we  pair  with  a  test  function  and 
use  integration  by  parts  to  find  the  corresponding  integral  equation.  Unfortunately, 
the  time  dependence  creates  some  technicalities  in  the  definition  that  we  are  not 
equipped  to  fully  resolve  here,  but  we  can  at  least  illustrate  the  basic  philosophy  by 
working  through  some  examples. 

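Consider first the wave equation on a bounded domain $\Omega \subset \mathbb{R}^n$ with Dirichlet
boundary conditions,

$$\frac{\partial^2 u}{\partial t^2} - \Delta u = 0, \qquad u|_{x\in\partial\Omega} = 0, \tag{10.36}$$

subject to the initial conditions

$$u|_{t=0} = g, \qquad \frac{\partial u}{\partial t}\Big|_{t=0} = h.$$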


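Assuming $u$ is a classical solution, pairing the wave equation for $u$ with a test function
$\psi \in C^\infty_{\mathrm{cpt}}([0, \infty) \times \Omega)$ gives

$$\int_0^\infty\!\!\int_\Omega \psi \left( \frac{\partial^2 u}{\partial t^2} - \Delta u \right) d^n x\,dt = 0.$$

Integration by parts for the $\Delta u$ term works just as in (10.31), yielding

$$-\int_\Omega \psi\,\Delta u\;d^n x = \int_\Omega \nabla u \cdot \nabla\psi\;d^n x.$$

In the $t$ variable, we integrate by parts twice and pick up a boundary term each time
because the test function is not assumed to vanish at $t = 0$. The result is

$$\begin{aligned}
\int_0^\infty u\,\frac{\partial^2\psi}{\partial t^2}\,dt &= -u\,\frac{\partial\psi}{\partial t}\Big|_{t=0} - \int_0^\infty \frac{\partial\psi}{\partial t}\frac{\partial u}{\partial t}\,dt \\
&= -g\,\frac{\partial\psi}{\partial t}\Big|_{t=0} + h\,\psi|_{t=0} + \int_0^\infty \psi\,\frac{\partial^2 u}{\partial t^2}\,dt.
\end{aligned}$$

Combining this with the spatial integral yields

$$\int_0^\infty\!\!\int_\Omega \left[ u\,\frac{\partial^2\psi}{\partial t^2} + \nabla u \cdot \nabla\psi \right] d^n x\,dt = -\int_\Omega g\,\frac{\partial\psi}{\partial t}\Big|_{t=0}\,d^n x + \int_\Omega h\,\psi|_{t=0}\,d^n x. \tag{10.37}$$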


As in Sect. 10.5, the Dirichlet boundary condition is imposed by assuming that
$u(t, \cdot) \in H^1_0(\Omega)$ for all $t$. To make sense of the boundary and initial terms we need
to assume at least that $g, h \in L^1_{\mathrm{loc}}(\Omega)$. To interpret (10.37) we
also need to require that the spatial pairing of $\nabla u$ with $\nabla\psi$ is integrable over $t$. This
condition is more technical. It turns out to be sufficient to assume that $\|u(t, \cdot)\|_{H^1}$
is integrable as a function of $t$, but we will not attempt to justify this here. Instead
we will limit our discussion to examples for which the existence of the integrals in
(10.37) is clear.


Example 10.16 Consider the piecewise linear d’Alembert solution for the wave
equation introduced in Sect. 1.2. On $[0, 2]$ we take the initial conditions $h = 0$
and

$$g(x) = \begin{cases} x, & 0 \le x \le 1, \\ 2 - x, & 1 \le x \le 2. \end{cases}$$

By Theorem 4.5, d’Alembert’s solution is given by extending $g$ to an odd periodic
function on $\mathbb{R}$ with period 4, and then setting

$$u(t, x) = \frac{1}{2}\left[ g(x + t) + g(x - t) \right].$$

The linear components of the resulting solution are shown in Fig. 10.7. Because $u$ is
piecewise linear and vanishes at $x = 0$ and $2$, it is clear that $u(t, \cdot) \in H^1_0(0, 2)$ for
each $t$.
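The construction in Example 10.16 is easy to evaluate pointwise. The sketch below implements the odd, period-4 extension of $g$ and d’Alembert’s formula; the function names `g_extended` and `u` are chosen here for illustration and do not appear in the text.

```python
def g_extended(x: float) -> float:
    """Odd, period-4 extension of g(x) = x on [0, 1], 2 - x on [1, 2]."""
    y = (x + 2.0) % 4.0 - 2.0          # reduce to the period cell [-2, 2)
    sign = 1.0 if y >= 0.0 else -1.0   # odd symmetry about 0
    y = abs(y)
    return sign * (y if y <= 1.0 else 2.0 - y)


def u(t: float, x: float) -> float:
    """d'Alembert solution u = (g(x + t) + g(x - t)) / 2 with h = 0."""
    return 0.5 * (g_extended(x + t) + g_extended(x - t))


if __name__ == "__main__":
    # For 0 <= t <= 1 the solution is x, 1 - t, or 2 - x depending on the region.
    for t, x in [(0.25, 0.5), (0.5, 1.0), (0.5, 1.75), (2.0, 1.0)]:
        print(f"u({t}, {x}) = {u(t, x)}")
```

The Dirichlet condition $u(t, 0) = u(t, 2) = 0$ holds for every $t$ because the extension is odd about both $x = 0$ and $x = 2$.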

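For this case the weak solution condition (10.37) specializes to

$$\int_0^\infty\!\!\int_0^2 \left[ u\,\frac{\partial^2\psi}{\partial t^2} + \frac{\partial u}{\partial x}\frac{\partial\psi}{\partial x} \right] dx\,dt = -\int_0^2 g\,\frac{\partial\psi}{\partial t}\Big|_{t=0}\,dx. \tag{10.38}$$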


Checking this is essentially a matter of integration by parts, but the integrals must
be broken into many pieces for large $t$. As a sample case, let us assume that $\psi$ has
support in $[0, 1) \times (0, 2)$.

Fig. 10.7 Piecewise linear wave solution $u(t, x)$


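The first integral in (10.38) becomes

$$\begin{aligned}
\int_0^1\!\!\int_0^2 u\,\frac{\partial^2\psi}{\partial t^2}\,dx\,dt
&= \int_0^1 \left[ \int_{1-x}^1 (1 - t)\frac{\partial^2\psi}{\partial t^2}\,dt + \int_0^{1-x} x\,\frac{\partial^2\psi}{\partial t^2}\,dt \right] dx \\
&\quad + \int_1^2 \left[ \int_{x-1}^1 (1 - t)\frac{\partial^2\psi}{\partial t^2}\,dt + \int_0^{x-1} (2 - x)\frac{\partial^2\psi}{\partial t^2}\,dt \right] dx \\
&= \int_0^1 \left[ -x\,\frac{\partial\psi}{\partial t}(0, x) - \psi(1 - x, x) \right] dx \\
&\quad + \int_1^2 \left[ -(2 - x)\frac{\partial\psi}{\partial t}(0, x) - \psi(x - 1, x) \right] dx.
\end{aligned}$$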


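Similarly, the second term in (10.38) evaluates to

$$\begin{aligned}
\int_0^\infty\!\!\int_0^2 \frac{\partial u}{\partial x}\frac{\partial\psi}{\partial x}\,dx\,dt
&= \int_0^1 \left[ \int_0^{1-t} \frac{\partial\psi}{\partial x}\,dx - \int_{1+t}^2 \frac{\partial\psi}{\partial x}\,dx \right] dt \\
&= \int_0^1 \psi(1 - x, x)\,dx + \int_1^2 \psi(x - 1, x)\,dx.
\end{aligned}$$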

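Adding these pieces together gives

$$\begin{aligned}
\int_0^\infty\!\!\int_0^2 \left[ u\,\frac{\partial^2\psi}{\partial t^2} + \frac{\partial u}{\partial x}\frac{\partial\psi}{\partial x} \right] dx\,dt
&= -\int_0^1 x\,\frac{\partial\psi}{\partial t}(0, x)\,dx - \int_1^2 (2 - x)\frac{\partial\psi}{\partial t}(0, x)\,dx \\
&= -\int_0^2 g(x)\,\frac{\partial\psi}{\partial t}(0, x)\,dx,
\end{aligned}$$

which verifies (10.38) for this case.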


◊

Now let us consider the weak formulation of the heat equation with Dirichlet
boundary conditions,

$$\frac{\partial u}{\partial t} - \Delta u = 0, \qquad u|_{x\in\partial\Omega} = 0, \qquad u|_{t=0} = h. \tag{10.39}$$

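Derivation of the integral equation works just as for the wave equation, except that
there is only a single integration by parts in the time variable. Assuming that
$u(t, \cdot) \in H^1_0(\Omega)$ for each $t > 0$ and $h \in L^1_{\mathrm{loc}}(\Omega)$, the weak solution condition is

$$\int_0^\infty\!\!\int_\Omega \left[ -u\,\frac{\partial\psi}{\partial t} + \nabla u \cdot \nabla\psi \right] d^n x\,dt = \int_\Omega h\,\psi|_{t=0}\;d^n x \tag{10.40}$$

for all $\psi \in C^\infty_{\mathrm{cpt}}([0, \infty) \times \Omega)$.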

Example 10.17 Consider the heat equation on the interval $(0, \pi)$, with initial condition
$h \in L^2(0, \pi; \mathbb{R})$. In view of the Dirichlet boundary conditions, we use the
orthonormal basis for $L^2(0, \pi)$ developed in Exercise 8.4, given by the sine functions

$$\phi_k(x) := \sqrt{\frac{2}{\pi}}\,\sin(kx)$$

for $k \in \mathbb{N}$. The coefficients associated to $h$ are

$$a_k := \int_0^\pi h(x)\,\phi_k(x)\,dx,$$

and $\sum a_k\phi_k$ converges to $h$ in the $L^2$ sense by Theorem 8.6. The corresponding heat
solution is

$$u(t, x) = \sum_{k=1}^\infty a_k e^{-k^2 t}\,\phi_k(x). \tag{10.41}$$

For example, the solution corresponding to a step function $h$ is illustrated in Fig. 10.8.

As noted in Corollary 8.14, this yields a classical solution if $h$ is continuous. If $h$
is not continuous then $u$ might not have a well-defined limit as $t \to 0$. Nevertheless,
we can check that the weak solution condition is satisfied. Given a (real-valued) test
function $\psi \in C^\infty_{\mathrm{cpt}}([0, \infty) \times (0, \pi); \mathbb{R})$, define the time-dependent Fourier coefficients

$$b_k(t) := \int_0^\pi \psi(t, x)\,\phi_k(x)\,dx.$$

By the smoothness of $\psi$, Theorem 8.10 implies that the coefficients $b_k(t)$ decay rapidly
as $k \to \infty$, uniformly in $t$, and so the series

$$\psi(t, x) = \sum_{k=1}^\infty b_k(t)\,\phi_k(x)$$

converges uniformly as well as in $L^2$. By the same principle, the series

$$\frac{\partial\psi}{\partial t}(t, x) = \sum_{k=1}^\infty b_k'(t)\,\phi_k(x)$$


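is also uniformly convergent. Since $\{\phi_k\}$ is an orthonormal basis, we deduce from
Parseval’s identity (8.36) that

$$\int_0^\pi u\,\frac{\partial\psi}{\partial t}\,dx = \sum_{k=1}^\infty a_k e^{-k^2 t}\,b_k'(t) \tag{10.42}$$

for $t \ge 0$.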

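Similarly, for $t > 0$ we have $L^2$ convergent series

$$\frac{\partial\psi}{\partial x}(t, x) = \sum_{k=1}^\infty k\,b_k(t)\sqrt{\frac{2}{\pi}}\,\cos(kx), \qquad
\frac{\partial u}{\partial x}(t, x) = \sum_{k=1}^\infty a_k e^{-k^2 t}\,k\sqrt{\frac{2}{\pi}}\,\cos(kx).$$

By the Parseval identity for the cosine basis, this gives

$$\int_0^\pi \frac{\partial u}{\partial x}\frac{\partial\psi}{\partial x}\,dx = \sum_{k=1}^\infty k^2 a_k e^{-k^2 t}\,b_k(t). \tag{10.43}$$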


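Applying (10.42) and (10.43) to the left-hand side of (10.40) yields

$$\int_0^\infty\!\!\int_0^\pi \left[ -u\,\frac{\partial\psi}{\partial t} + \frac{\partial u}{\partial x}\frac{\partial\psi}{\partial x} \right] dx\,dt = \int_0^\infty \sum_{k=1}^\infty a_k e^{-k^2 t}\left( k^2 b_k(t) - b_k'(t) \right) dt. \tag{10.44}$$

Switching the order of the summation and integration is justified if the series converges
uniformly on the domain of the integral, but that is not necessarily the case here. To
check this carefully, we break the sum at some value $k = N$. For the finite sum there
is no convergence issue, so that

$$\int_0^\infty \sum_{k=1}^N a_k e^{-k^2 t}\left( k^2 b_k(t) - b_k'(t) \right) dt = -\sum_{k=1}^N a_k \int_0^\infty \frac{d}{dt}\left[ e^{-k^2 t}\,b_k(t) \right] dt = \sum_{k=1}^N a_k b_k(0).$$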


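To estimate the tail of the sum, note that the sequence $\{a_k\}$ is bounded because
$\sum |a_k|^2 < \infty$. For $b_k(t)$, we apply repeated integration by parts to deduce

$$\begin{aligned}
\int_0^\pi \phi_k(x)\left( \frac{\partial}{\partial x} \right)^{2m}\psi(t, x)\,dx &= \int_0^\pi \psi(t, x)\left( \frac{d}{dx} \right)^{2m}\phi_k(x)\,dx \\
&= (-1)^m k^{2m} \int_0^\pi \psi(t, x)\,\phi_k(x)\,dx \\
&= (-1)^m k^{2m}\,b_k(t),
\end{aligned}$$

for $m \in \mathbb{N}$. Since $\psi \in C^\infty_{\mathrm{cpt}}([0, \infty) \times (0, \pi))$, this gives an estimate

$$|b_k(t)| \le C_m k^{-2m},$$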

where $C_m$ is independent of $t$. The same reasoning applies to $b_k'(t)$. Combining the
$m = 2$ estimate for $b_k$ with the $m = 1$ case for $b_k'$ gives

$$\left| a_k e^{-k^2 t}\left( k^2 b_k(t) - b_k'(t) \right) \right| \le C k^{-2}.$$

This shows that

$$\left| \sum_{k=N+1}^\infty a_k e^{-k^2 t}\left( k^2 b_k(t) - b_k'(t) \right) \right| \le C N^{-1}, \tag{10.45}$$

independently of $t$.

Now fix $M > 0$ so the support of $\psi$ is contained in $[0, M] \times (0, \pi)$. Applying (10.45) to
the integral gives

$$\left| \int_0^M \sum_{k=N+1}^\infty a_k e^{-k^2 t}\left( k^2 b_k(t) - b_k'(t) \right) dt \right| \le C M N^{-1}.$$

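Returning to (10.44), our analysis of the sum over $k$ now gives

$$\int_0^\infty\!\!\int_0^\pi \left[ -u\,\frac{\partial\psi}{\partial t} + \frac{\partial u}{\partial x}\frac{\partial\psi}{\partial x} \right] dx\,dt = \sum_{k=1}^N a_k b_k(0) + O(N^{-1}).$$

By taking $N \to \infty$, we deduce

$$\int_0^\infty\!\!\int_0^\pi \left[ -u\,\frac{\partial\psi}{\partial t} + \frac{\partial u}{\partial x}\frac{\partial\psi}{\partial x} \right] dx\,dt = \sum_{k=1}^\infty a_k b_k(0).$$

On the other hand Parseval’s identity gives

$$\int_0^\pi h(x)\,\psi(0, x)\,dx = \sum_{k=1}^\infty a_k b_k(0),$$

so the weak solution condition (10.40) is satisfied.

◊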
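The series (10.41) is straightforward to evaluate by partial sums. The sketch below uses one particular step function as an assumption, $h = 1$ on $(0, \pi/2)$ and $h = 0$ on $(\pi/2, \pi)$, which is not necessarily the step shown in Fig. 10.8. The names `sine_coefficient_step` and `heat_solution` are hypothetical.

```python
import math


def sine_coefficient_step(k: int) -> float:
    """a_k = int_0^pi h(x) phi_k(x) dx for h = 1 on (0, pi/2), 0 on (pi/2, pi)."""
    return math.sqrt(2.0 / math.pi) * (1.0 - math.cos(k * math.pi / 2.0)) / k


def heat_solution(t: float, x: float, terms: int = 200) -> float:
    """Partial sum of (10.41) for the step initial condition above."""
    total = 0.0
    for k in range(1, terms + 1):
        phi_k = math.sqrt(2.0 / math.pi) * math.sin(k * x)
        total += sine_coefficient_step(k) * math.exp(-k * k * t) * phi_k
    return total


if __name__ == "__main__":
    # Parseval: the sum of a_k^2 should approach ||h||^2 = pi/2 (slowly).
    partial = sum(sine_coefficient_step(k) ** 2 for k in range(1, 2001))
    print("sum of a_k^2:", partial, " pi/2:", math.pi / 2.0)
    for t in (0.01, 0.1, 1.0):
        print(f"u({t}, pi/4) = {heat_solution(t, math.pi / 4.0):.4f}")
```

For $t > 0$ the factor $e^{-k^2 t}$ makes the series converge very quickly, even though the coefficients $a_k$ themselves decay only like $1/k$.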


10.7  Exercises 


10.1 On $\mathbb{R}$ consider the ordinary differential equation

$$x\,\frac{du}{dx} = 1.$$

(a) Develop a weak formulation of this ODE in terms of pairing with a test function
$\psi \in C^\infty_{\mathrm{cpt}}(\mathbb{R})$.

(b) Show that $u(x) = \log|x|$ is locally integrable and solves the equation in the
weak sense.

10.2 In Exercise 3.6 we studied Burgers’ equation,

$$\frac{\partial u}{\partial t} + u\,\frac{\partial u}{\partial x} = 0,$$

with the initial condition

$$u(0, x) = \begin{cases} a, & x \le 0, \\ a(1 - x) + bx, & 0 < x < 1, \\ b, & x \ge 1. \end{cases}$$

For $a > b$ a shock forms at some positive time. Assuming $u$ is a weak solution, find
the equation of the shock curve starting from this point.

10.3 Consider the traffic equation

$$\frac{\partial u}{\partial t} + (1 - 2u)\frac{\partial u}{\partial x} = 0$$

with initial data

$$g(x) = \begin{cases} 0, & x > 0, \\ 1, & x < 0. \end{cases}$$

(a) Sketch the characteristic lines for this initial condition, and show that they leave
a triangular region uncovered.

(b) Show that the constant solution $u(t, x) = g(x)$ satisfies the Rankine-Hugoniot
condition for the shock curve $\sigma(t) = 0$ and thus gives a weak solution of the
traffic equation. (This solution is considered unphysical because characteristic
lines emerge from the shock line to fill the triangular region.)

(c) The physical solution is specified by an entropy condition that says that characteristic
lines may only intersect when followed forwards in time. Show that the
continuous function

$$u(t, x) = \begin{cases} 0, & x > t, \\ \frac{1}{2} - \frac{x}{2t}, & -t < x < t, \\ 1, & x < -t, \end{cases}$$

satisfies the weak solution condition (10.9) (with $q = u - u^2$), as well as this
entropy condition. (This type of solution is called a rarefaction wave.)


10.4 Define $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$ by

$$f(x) = \begin{cases} f_+(x), & x_n > 0, \\ f_-(x), & x_n < 0, \end{cases}$$

with $f_\pm \in C^1(\mathbb{R}^n)$.

(a) For $j = 1, \dots, n - 1$, show that $f$ has weak partial derivatives given by
$\partial f_\pm/\partial x_j$ for $\pm x_n > 0$.

(b) Show that the weak partial $\partial f/\partial x_n$ exists and is given by $\partial f_\pm/\partial x_n$ for $\pm x_n > 0$ only if $f$
extends to a continuous function at $x_n = 0$.


10.5 Let $D \subset \mathbb{R}^2$ be the unit disk $\{r < 1\}$ with $r := |x|$. Consider the function
$u(x) = r^a$ with $a \in \mathbb{R}$ constant.

(a) Compute the ordinary partial derivatives $\partial u/\partial x_j$, $j = 1, 2$, for $r \ne 0$.

(b) Show that for $a > -1$ these partials lie in $L^1(D)$ and define weak derivatives.

(c) For what values of $a$ is $u \in H^1(D)$?


10.6 In $\mathbb{R}^3$ consider the equation

$$\Delta u = \begin{cases} 1, & r \le a, \\ 0, & r > a, \end{cases}$$

with $r := |x|$ and $a > 0$. (With appropriate physical constants this is the equation
for the gravitational potential of a spherical planet of radius $a$.)

(a) Assuming that $u$ depends only on $r$, formulate a weak solution condition in terms
of pairing with a test function $\psi(r)$ with $\psi \in C^\infty_{\mathrm{cpt}}[0, \infty)$.

(b) Find the unique solution which is smooth at $r = 0$ and for which $u(r) \to 0$ as
$r \to \infty$.


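10.7 Let $\Omega \subset \mathbb{R}^n$ be bounded with $\partial\Omega$ piecewise $C^1$. For $u \in C^2(\overline{\Omega})$ and
$f \in C^0(\Omega)$, suppose that

$$\int_\Omega \left[ \nabla u \cdot \nabla\psi - f\psi \right] d^n x = 0 \tag{10.46}$$

for all $\psi \in C^\infty(\overline{\Omega})$. Show that $u$ satisfies the Poisson equation with Neumann boundary
condition,

$$-\Delta u = f, \qquad \frac{\partial u}{\partial\nu}\Big|_{\partial\Omega} = 0. \tag{10.47}$$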


(Thus (10.46) allows a weak formulation of (10.47) for $u \in H^1(\Omega)$.)


Chapter  11 

Variational  Methods 


Recall the formulas for the kinetic and potential energy of a solution of the wave
equation derived in Sect. 4.7. At equilibrium the kinetic energy is zero, and by physical
reasoning the system should occupy a state of minimum potential energy. This
suggests a strategy of reformulating the Laplace equation, which models the equilibrium
state, as a minimization problem for the energy.

In this application, the potential energy term from the wave equation is called the
Dirichlet energy. For a bounded domain $\Omega \subset \mathbb{R}^n$ and $w \in C^2(\overline{\Omega})$, let

$$\mathcal{E}[w] := \frac{1}{2}\int_\Omega |\nabla w|^2\,d^n x. \tag{11.1}$$

The term functional is used to describe functions such as $\mathcal{E}[\cdot]$, to indicate that the
domain is a function space.

To see how minimization of energy is related to the Laplace equation, let us
suppose that $u \in C^2(\overline{\Omega}; \mathbb{R})$ satisfies

$$\mathcal{E}[u] \le \mathcal{E}[u + \varphi]$$

for all $\varphi \in C^\infty_{\mathrm{cpt}}(\Omega; \mathbb{R})$. This implies that for $t \in \mathbb{R}$ the function $t \mapsto \mathcal{E}[u + t\varphi]$
achieves a global minimum at $t = 0$. Hence

$$\frac{d}{dt}\,\mathcal{E}[u + t\varphi]\Big|_{t=0} = 0. \tag{11.2}$$

Differentiation  under  the  integral  in  the  definition  of  £\u  +  tip\  gives 


The original version of the book was revised: Belated corrections from author have been incorporated. The erratum to the book is available at https://doi.org/10.1007/978-3-319-48936-0_14


© Springer International Publishing AG 2016
D. Borthwick, Introduction to Partial Differential Equations,
Universitext, DOI 10.1007/978-3-319-48936-0_11
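$$\frac{d}{dt} \mathcal{E}[u + t\varphi] \Big|_{t=0} = \frac{1}{2} \frac{d}{dt} \int_\Omega \left( |\nabla u|^2 + 2t \nabla u \cdot \nabla\varphi + t^2 |\nabla\varphi|^2 \right) d^n x \, \Big|_{t=0} = \int_\Omega \nabla u \cdot \nabla\varphi \, d^n x.$$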


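By Green's first identity (Theorem 2.10) and the fact that $\varphi$ vanishes on $\partial\Omega$,

$$\int_\Omega \nabla u \cdot \nabla\varphi \, d^n x = -\int_\Omega \varphi \, \Delta u \, d^n x.$$

Thus (11.2) is equivalent to

$$\int_\Omega \varphi \, \Delta u \, d^n x = 0.$$

This holds for all $\varphi \in C^\infty_{\mathrm{cpt}}(\Omega; \mathbb{R})$ if and only if $\Delta u = 0$ on $\Omega$.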
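The link between harmonicity and minimal energy can be checked numerically in one dimension. The following is a minimal sketch, assuming a uniform grid on $[0, 1]$ and a difference-quotient version of (11.1); the grid size, the harmonic function $u(x) = x$, and the perturbation $\sin^2(\pi x)$ are illustrative choices, not taken from the text.

```python
import math

def dirichlet_energy(w, h):
    """Discrete analogue of (11.1): (1/2) * h * sum of squared difference quotients."""
    return 0.5 * h * sum(((w[i + 1] - w[i]) / h) ** 2 for i in range(len(w) - 1))

n = 200                                   # number of grid intervals (illustrative)
h = 1.0 / n
x = [i * h for i in range(n + 1)]

u = list(x)                               # u(x) = x satisfies u'' = 0 on (0, 1)
phi = [math.sin(math.pi * xi) ** 2 for xi in x]   # vanishes at x = 0 and x = 1

base_energy = dirichlet_energy(u, h)
perturbed = {
    t: dirichlet_energy([ui + t * p for ui, p in zip(u, phi)], h)
    for t in (-0.5, -0.1, 0.1, 0.5)
}

for t, value in perturbed.items():
    print(f"t = {t:+.1f}: E[u + t*phi] - E[u] = {value - base_energy:.6f}")
```

Every perturbation raises the energy, and the increase is symmetric in $t$ because the cross term $\int \nabla u \cdot \nabla\varphi$ vanishes for harmonic $u$.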


11.1  Model  Problem:  The  Poisson  Equation 

The empirical law describing the electric field in the presence of a charge distribution was formulated by Gauss in the mid-19th century. Gauss's law states that the outward flux of the electric field through a closed surface is proportional to the total electric charge contained within the region bounded by the surface. More specifically, if $\Omega \subset \mathbb{R}^3$ is a bounded domain with piecewise $C^1$ boundary $\partial\Omega$ and $\rho$ is the charge density within $\Omega$, then

$$\int_{\partial\Omega} \mathbf{E} \cdot \nu \, dS = 4\pi k \int_\Omega \rho \, d^3 x,$$

where $k$ is called Coulomb's constant.

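Using the divergence theorem (Theorem 2.6), we can rewrite the flux integral as

$$\int_{\partial\Omega} \mathbf{E} \cdot \nu \, dS = \int_\Omega \nabla \cdot \mathbf{E} \, d^3 x,$$

so that Gauss's law becomes

$$\int_\Omega \nabla \cdot \mathbf{E} \, d^3 x = 4\pi k \int_\Omega \rho \, d^3 x.$$

This holds for an arbitrary region if and only if

$$\nabla \cdot \mathbf{E} = 4\pi k \rho \tag{11.3}$$

(the differential form of Gauss's law).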
In Sect. 9.1 we noted that the electric potential $\phi$ and electric field $\mathbf{E}$ are related by

$$\mathbf{E} = -\nabla\phi.$$

Substituting this into (11.3) gives the Poisson equation

$$-\Delta\phi = 4\pi k \rho.$$

A common electrostatics problem is to find the electric potential caused by a distribution of charges within a region $\Omega$ bounded by a conducting material. The electric field must be perpendicular to a conducting surface, implying that the boundary restriction $\phi|_{\partial\Omega}$ is constant. If the boundary is connected then we can set this constant to 0, so the potential satisfies the Poisson equation with Dirichlet boundary conditions.

Other forms of the Poisson equation appear in various contexts. For example, in Newtonian gravity the relationship between gravitational potential $\phi$ and the mass density $\rho$ is

$$\Delta\phi = 4\pi G \rho,$$

where $G$ is Newton's gravitational constant. In this application the domain is $\mathbb{R}^3$, and the physical assumption that $\phi \to 0$ at infinity plays the role of a boundary condition.
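As a sanity check on the electrostatic Poisson equation $-\Delta\phi = 4\pi k\rho$, one can test it on the standard closed-form potential of a uniformly charged ball. The sketch below is not from the text: the potential formula is the usual textbook one, and the values of `k`, `Q`, `a` and the sample radii are arbitrary illustrative choices. It applies the radial Laplacian $\phi'' + (2/r)\phi'$ by central differences.

```python
import math

k, Q, a = 1.0, 2.0, 1.5                        # hypothetical constant, total charge, ball radius
rho = Q / (4.0 / 3.0 * math.pi * a ** 3)       # uniform charge density inside the ball

def phi(r):
    """Potential of a uniformly charged ball (standard closed form)."""
    if r <= a:
        return k * Q * (3 * a ** 2 - r ** 2) / (2 * a ** 3)
    return k * Q / r

def radial_laplacian(f, r, h=1e-3):
    """Laplacian of a radial function in R^3: f'' + (2/r) f', via central differences."""
    d1 = (f(r + h) - f(r - h)) / (2 * h)
    d2 = (f(r + h) - 2 * f(r) + f(r - h)) / h ** 2
    return d2 + 2.0 * d1 / r

inside = -radial_laplacian(phi, 0.7)           # sample point with r < a
outside = -radial_laplacian(phi, 3.0)          # sample point with r > a

print("inside :", inside, "expected", 4 * math.pi * k * rho)
print("outside:", outside, "expected 0")
```

Inside the ball the computed value of $-\Delta\phi$ matches $4\pi k\rho$, and outside it vanishes up to discretization error, as the equation requires in a charge-free region.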


11.2  Dirichlet’s  Principle 

To solve Poisson's equation using a minimization argument, it proves to be very helpful to use the weak formulation of the equation. This is because $H^1_0(\Omega)$ is a Hilbert space. The completeness of $H^1_0(\Omega)$ with respect to the $H^1$ norm will play an essential role in establishing the existence of a minimizing function.

For $\Omega \subset \mathbb{R}^n$ bounded, and $f \in C^0(\overline{\Omega})$, the classical Poisson problem is to find a function $u \in C^2(\Omega) \cap C^0(\overline{\Omega})$ so that

$$-\Delta u = f, \qquad u|_{\partial\Omega} = 0. \tag{11.4}$$

The weak formulation of (11.4) is a special case of (10.34). We take $f \in L^2(\Omega)$ and the goal is to find $u \in H^1_0(\Omega)$ such that

$$\int_\Omega \left[ \nabla u \cdot \nabla\psi - f\psi \right] d^n x = 0, \tag{11.5}$$

for every $\psi \in C^\infty_{\mathrm{cpt}}(\Omega)$.

For convenience let us consider real-valued functions. (In the complex case we could split the Poisson problem into real and imaginary parts.) In view of (11.5), we define the functional

$$\mathcal{D}_f[w] := \mathcal{E}[w] - \langle f, w \rangle, \tag{11.6}$$

for $f \in L^2(\Omega; \mathbb{R})$ and $w \in H^1_0(\Omega; \mathbb{R})$, where $\mathcal{E}[\cdot]$ is the Dirichlet energy (11.1).

Theorem 11.1 (Dirichlet's principle) Suppose $\Omega \subset \mathbb{R}^n$ is a bounded domain and $f \in L^2(\Omega; \mathbb{R})$. If $u \in H^1_0(\Omega; \mathbb{R})$ satisfies

$$\mathcal{D}_f[u] \le \mathcal{D}_f[w]$$

for all $w \in H^1_0(\Omega; \mathbb{R})$, then $u$ is a weak solution of the Poisson equation, in the sense of (11.5).

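Proof Since $C^\infty_{\mathrm{cpt}}(\Omega; \mathbb{R}) \subset H^1_0(\Omega)$, the assumption on $u$ implies

$$\mathcal{D}_f[u] \le \mathcal{D}_f[u + t\psi]$$

for $\psi \in C^\infty_{\mathrm{cpt}}(\Omega; \mathbb{R})$ and $t \in \mathbb{R}$. Therefore

$$0 = \frac{d}{dt} \mathcal{D}_f[u + t\psi] \Big|_{t=0} = \frac{d}{dt} \left( \mathcal{E}[u + t\psi] - \langle f, u + t\psi \rangle \right) \Big|_{t=0} = \int_\Omega \left[ \nabla u \cdot \nabla\psi - f\psi \right] d^n x.$$

□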

Dirichlet's principle is a classic example of a variational method. This terminology refers to the family of variations $u + t\psi$ used to derive the PDE from the minimization problem. In Sect. 11.3 we will show that $\mathcal{D}_f$ attains a minimum within $H^1_0(\Omega)$, guaranteeing the existence of a weak solution. Furthermore, in Sect. 11.4 we will see that the weak solution is actually a classical solution under certain conditions.
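Dirichlet's principle has a direct finite-difference counterpart. The sketch below assumes a uniform grid on $(0, 1)$ and the sample right-hand side $f(x) = \pi^2 \sin(\pi x)$, for which $-u'' = f$ has the solution $\sin(\pi x)$; both are illustrative choices. It solves the discrete Poisson problem with the Thomas algorithm and then checks that this solution gives a smaller value of a discretized $\mathcal{D}_f$ than nearby perturbed functions. The helper names `solve_poisson_1d` and `D_f` are hypothetical.

```python
import math

def solve_poisson_1d(f_values, h):
    """Thomas algorithm for (-w[i-1] + 2 w[i] - w[i+1]) / h^2 = f[i], zero boundary values."""
    m = len(f_values)
    c = [0.0] * m
    d = [0.0] * m
    c[0] = -0.5
    d[0] = 0.5 * h * h * f_values[0]
    for i in range(1, m):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (h * h * f_values[i] + d[i - 1]) / denom
    w = [0.0] * m
    w[-1] = d[-1]
    for i in range(m - 2, -1, -1):
        w[i] = d[i] - c[i] * w[i + 1]
    return w

def D_f(w, f_values, h):
    """Discrete version of (11.6): Dirichlet energy minus the L^2 pairing with f."""
    padded = [0.0] + list(w) + [0.0]
    energy = 0.5 * h * sum(((padded[i + 1] - padded[i]) / h) ** 2
                           for i in range(len(padded) - 1))
    pairing = h * sum(fi * wi for fi, wi in zip(f_values, w))
    return energy - pairing

n = 100
h = 1.0 / n
xs = [i * h for i in range(1, n)]                      # interior grid points
f_values = [math.pi ** 2 * math.sin(math.pi * x) for x in xs]

u = solve_poisson_1d(f_values, h)
max_error = max(abs(ui - math.sin(math.pi * x)) for ui, x in zip(u, xs))

bump = [x * (1.0 - x) for x in xs]                     # perturbation vanishing at the boundary
d_min = D_f(u, f_values, h)
d_perturbed = [D_f([ui + t * b for ui, b in zip(u, bump)], f_values, h)
               for t in (-0.2, 0.05, 0.3)]

print("max error vs sin(pi x):", max_error)
print("D_f at solution:", d_min, " perturbed:", d_perturbed)
```

The discrete solution is exactly the stationary point of the discrete functional, so each perturbed value exceeds `d_min`, mirroring the variational characterization in Theorem 11.1.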


11.3  Coercivity  and  Existence  of  a  Minimum 

The functional $\mathcal{D}_f$ defined in (11.6) consists of a quadratic term plus a linear term. The Dirichlet minimization problem is thus analogous to minimizing the polynomial $ax^2 + bx$ for $x \in \mathbb{R}$. This polynomial obviously has a minimum if and only if $a > 0$. For the Dirichlet case the analogous condition is a lower bound on the quadratic term $\mathcal{E}[\cdot]$. The original form of this result was proven by Henri Poincaré.

Theorem 11.2 (Poincaré Inequality) For a bounded domain $\Omega \subset \mathbb{R}^n$, there is a constant $\kappa > 0$, depending only on $\Omega$, such that

$$\|u\|_2^2 \le \kappa^2 \, \mathcal{E}[u]$$

for all $u \in H^1_0(\Omega)$.

Proof Because $C^\infty_{\mathrm{cpt}}(\Omega)$ is dense in $H^1_0(\Omega)$, we can restrict our attention to smooth compactly supported functions at first and extend to the general case later. To illustrate the core argument we start with the $n = 1$ case.

Consider a bounded interval $(a, b) \subset \mathbb{R}$. For $\psi \in C^\infty_{\mathrm{cpt}}(a, b)$, our goal is to compare the size of $\psi$ to values of its derivative $\psi'$. The obvious connection between them comes from the fundamental theorem of calculus:

$$\psi(x) = \int_a^x \psi'(t) \, dt.$$

The right-hand side can be written as an $L^2$ pairing with the characteristic function of $[a, x]$,

$$\psi(x) = \langle \psi', \chi_{[a,x]} \rangle.$$

The Cauchy-Schwarz inequality (Theorem 7.1) then gives, since $\|\psi'\|_2^2 = 2\mathcal{E}[\psi]$ by (11.1),

$$|\psi(x)|^2 \le \|\chi_{[a,x]}\|_2^2 \, \|\psi'\|_2^2 \le 2(b - a) \, \mathcal{E}[\psi]$$

for all $x \in [a, b]$. We can integrate this estimate over $x$ to obtain

$$\|\psi\|_2^2 \le 2(b - a)^2 \, \mathcal{E}[\psi]$$

for $\psi \in C^\infty_{\mathrm{cpt}}(a, b)$.

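Now let us consider the higher dimensional case $\Omega \subset \mathbb{R}^n$. The domain is assumed to be bounded, so

$$\Omega \subset \mathcal{R} := [-M, M]^n$$

for some large $M$. Functions in $C^\infty_{\mathrm{cpt}}(\Omega)$ can be extended by zero to smooth functions on $\mathcal{R}$, so it suffices to derive the Poincaré inequality for $\psi \in C^\infty_{\mathrm{cpt}}(\mathcal{R})$. Following the one-dimensional case, we apply the fundamental theorem of calculus in the $x_1$ variable to write

$$\psi(x) = \int_{-M}^{x_1} \frac{\partial\psi}{\partial x_1}(y, x_2, \dots, x_n) \, dy.$$

By the Cauchy-Schwarz inequality on $L^2(-M, M)$,

$$|\psi(x)|^2 \le 2M \int_{-M}^{M} \left| \frac{\partial\psi}{\partial x_1}(y, x_2, \dots, x_n) \right|^2 dy$$

for all $x \in \mathcal{R}$. Integrating this estimate over $x \in \mathcal{R}$ yields

$$\|\psi\|_2^2 \le 4M^2 \left\| \frac{\partial\psi}{\partial x_1} \right\|_2^2. \tag{11.7}$$

By the definition (11.1), the energy is given by

$$\mathcal{E}[\psi] = \frac{1}{2} \int_\Omega \nabla\psi \cdot \nabla\psi \, d^n x = \frac{1}{2} \int_\Omega \left( \left| \frac{\partial\psi}{\partial x_1} \right|^2 + \dots + \left| \frac{\partial\psi}{\partial x_n} \right|^2 \right) d^n x = \frac{1}{2} \sum_{j=1}^n \left\| \frac{\partial\psi}{\partial x_j} \right\|_2^2.$$

Thus the bound (11.7) implies the estimate

$$\|\psi\|_2^2 \le 8M^2 \, \mathcal{E}[\psi], \tag{11.8}$$

for $\psi \in C^\infty_{\mathrm{cpt}}(\Omega)$.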

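To complete the argument, suppose that $u \in H^1_0(\Omega)$. By the definition of $H^1_0$ there exists an approximating sequence $\{\psi_k\} \subset C^\infty_{\mathrm{cpt}}(\Omega)$ such that $\psi_k \to u$ in the $H^1$ norm. By (11.8) the inequality

$$\|\psi_k\|_2^2 \le 8M^2 \, \mathcal{E}[\psi_k] \tag{11.9}$$

holds for each $k \in \mathbb{N}$. Our goal is thus to take the limit $k \to \infty$ on both sides. For the energy side note that

$$\mathcal{E}[u] - \mathcal{E}[\psi_k] = \frac{1}{2} \sum_{j=1}^n \left( \left\langle \frac{\partial u}{\partial x_j}, \frac{\partial u}{\partial x_j} \right\rangle - \left\langle \frac{\partial\psi_k}{\partial x_j}, \frac{\partial\psi_k}{\partial x_j} \right\rangle \right) = \frac{1}{2} \sum_{j=1}^n \left( \left\langle \frac{\partial(u - \psi_k)}{\partial x_j}, \frac{\partial u}{\partial x_j} \right\rangle + \left\langle \frac{\partial\psi_k}{\partial x_j}, \frac{\partial(u - \psi_k)}{\partial x_j} \right\rangle \right).$$

Hence by the Cauchy-Schwarz inequality

$$\left| \mathcal{E}[u] - \mathcal{E}[\psi_k] \right| \le \frac{1}{2} \sum_{j=1}^n \left\| \frac{\partial(u - \psi_k)}{\partial x_j} \right\|_2 \left( \left\| \frac{\partial u}{\partial x_j} \right\|_2 + \left\| \frac{\partial\psi_k}{\partial x_j} \right\|_2 \right).$$

By the definition of the $H^1$ norm this yields

$$\left| \mathcal{E}[u] - \mathcal{E}[\psi_k] \right| \le \|u - \psi_k\|_{H^1} \left( \|u\|_{H^1} + \|\psi_k\|_{H^1} \right).$$

In particular, $\psi_k \to u$ in $H^1$ implies that

$$\lim_{k \to \infty} \mathcal{E}[\psi_k] = \mathcal{E}[u].$$

Thus taking the limit $k \to \infty$ in (11.9) gives

$$\|u\|_2^2 \le 8M^2 \, \mathcal{E}[u].$$

□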
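The one-dimensional Cauchy-Schwarz argument gives $\|\psi\|_2^2 \le (b - a)^2 \|\psi'\|_2^2$, where $\|\psi'\|_2^2 = 2\mathcal{E}[\psi]$ under (11.1). A quick numerical check of this bound is sketched below, assuming midpoint quadrature and three sample functions that vanish at the endpoints of $(0, 2)$; the interval, the samples, and the helper name `norms` are illustrative choices.

```python
import math

def norms(psi, a, b, n=2000):
    """Approximate ||psi||_2^2 (midpoint rule) and ||psi'||_2^2 (difference quotients)."""
    h = (b - a) / n
    psi_sq = 0.0
    dpsi_sq = 0.0
    for i in range(n):
        left, right = a + i * h, a + (i + 1) * h
        psi_sq += psi(0.5 * (left + right)) ** 2 * h
        dpsi_sq += ((psi(right) - psi(left)) / h) ** 2 * h
    return psi_sq, dpsi_sq

a, b = 0.0, 2.0
samples = {
    "sine": lambda x: math.sin(math.pi * (x - a) / (b - a)),
    "parabola": lambda x: (x - a) * (b - x),
    "oscillating": lambda x: math.sin(3 * math.pi * (x - a) / (b - a)) ** 2,
}

ratios = {}
for name, psi in samples.items():
    psi_sq, dpsi_sq = norms(psi, a, b)
    # Ratio of the left side to the right side of ||psi||^2 <= (b - a)^2 ||psi'||^2.
    ratios[name] = psi_sq / ((b - a) ** 2 * dpsi_sq)
    print(f"{name:12s} ratio = {ratios[name]:.5f}")
```

All ratios stay well below 1, and the sine profile comes out near $1/\pi^2$, which illustrates how far the constant from this simple argument is from optimal.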

The Poincaré constant $\kappa(\Omega)$ for a bounded domain $\Omega$ is defined to be the optimal choice of $\kappa$ in Theorem 11.2. In other words,

$$\kappa(\Omega) := \sup_{u \in H^1_0(\Omega) \setminus \{0\}} \frac{\|u\|_2}{\mathcal{E}[u]^{1/2}}.$$

Our proof gives the rough estimate

$$\kappa(\Omega) \le \sqrt{8} \, \operatorname{diam}(\Omega),$$

which is rather poor compared to the best known bounds. In Sect. 11.5 we will establish a direct relationship between the Poincaré constant and the lowest eigenvalue of $\Delta$ on $\Omega$.

Since our goal is to minimize $\mathcal{D}_f$ over $H^1_0$, it is useful to express the conclusion of Theorem 11.2 in terms of the $H^1$ norm. Because

$$\|u\|_{H^1}^2 = \|u\|_2^2 + \mathcal{E}[u], \tag{11.10}$$

(11.8) is equivalent to

$$\|u\|_{H^1}^2 \le (\kappa^2 + 1) \, \mathcal{E}[u] \tag{11.11}$$

for all $u \in H^1_0(\Omega)$. A quadratic functional on a Hilbert space is called coercive if its ratio to the norm squared is bounded below by a positive constant. The Poincaré inequality thus states that $\mathcal{E}[\cdot]$ is coercive on $H^1_0(\Omega)$.

The identity (11.10) also gives an upper bound,

$$\mathcal{E}[u] \le \|u\|_{H^1}^2. \tag{11.12}$$

A quadratic functional is called bounded if its ratio to the norm squared is bounded above. For the energy this condition is automatic.

We are now prepared to tackle the minimization problem for $\mathcal{D}_f[\cdot]$, by exploiting the fact that $\mathcal{E}[\cdot]$ is both coercive and bounded.
Theorem 11.3 For a bounded domain $\Omega \subset \mathbb{R}^n$ and $f \in L^2(\Omega)$ there is a unique function $u \in H^1_0(\Omega)$ such that

$$\mathcal{D}_f[u] \le \mathcal{D}_f[w]$$

for all $w \in H^1_0(\Omega)$, where $\mathcal{D}_f[\cdot]$ is defined by (11.6).
Proof By the triangle inequality,

$$\mathcal{D}_f[w] \ge \mathcal{E}[w] - |\langle f, w \rangle|.$$

Applying (11.11) to the energy and the Cauchy-Schwarz inequality to the inner product gives

$$\mathcal{D}_f[w] \ge \frac{1}{\kappa^2 + 1} \|w\|_{H^1}^2 - \|f\|_2 \|w\|_2 \ge \frac{1}{\kappa^2 + 1} \|w\|_{H^1}^2 - \|f\|_2 \|w\|_{H^1}. \tag{11.13}$$

The right-hand side has the form $cx^2 - bx$ where $c = 1/(\kappa^2 + 1)$, $b = \|f\|_2$, and $x = \|w\|_{H^1}$. According to the minimization formula for a quadratic polynomial,

$$\min_{x \in \mathbb{R}} (cx^2 - bx) = -\frac{b^2}{4c}$$

for $c > 0$. Applying this to (11.13) gives

$$\mathcal{D}_f[w] \ge -\frac{\kappa^2 + 1}{4} \|f\|_2^2 \tag{11.14}$$

for $w \in H^1_0(\Omega)$.

If we set

$$d_0 := \inf_{w \in H^1_0(\Omega)} \mathcal{D}_f[w],$$

then (11.14) shows that $d_0 > -\infty$. By Lemma 2.1, there exists a sequence of $w_k \in H^1_0(\Omega)$ so that

$$\lim_{k \to \infty} \mathcal{D}_f[w_k] = d_0. \tag{11.15}$$

Our strategy is to argue that the sequence $\{w_k\}$ is Cauchy in $H^1_0(\Omega)$, and therefore converges by completeness.

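The quadratic structure of $\mathcal{E}[\cdot]$ implies that

$$\mathcal{E}\left[ \frac{u + v}{2} \right] = \frac{1}{2} \mathcal{E}[u] + \frac{1}{2} \mathcal{E}[v] - \frac{1}{4} \mathcal{E}[u - v] \tag{11.16}$$

for all $u, v \in H^1_0(\Omega)$. This allows us to compute

$$\mathcal{D}_f\left[ \frac{w_k + w_m}{2} \right] = \frac{1}{2} \mathcal{E}[w_k] + \frac{1}{2} \mathcal{E}[w_m] - \frac{1}{4} \mathcal{E}[w_k - w_m] - \frac{1}{2} \langle f, w_k + w_m \rangle = \frac{1}{2} \mathcal{D}_f[w_k] + \frac{1}{2} \mathcal{D}_f[w_m] - \frac{1}{4} \mathcal{E}[w_k - w_m].$$

Because $d_0$ is the infimum of $\mathcal{D}_f[\cdot]$, this implies

$$d_0 \le \frac{1}{2} \mathcal{D}_f[w_k] + \frac{1}{2} \mathcal{D}_f[w_m] - \frac{1}{4} \mathcal{E}[w_k - w_m]. \tag{11.17}$$

Turning this inequality around gives

$$\mathcal{E}[w_k - w_m] \le 2\mathcal{D}_f[w_k] + 2\mathcal{D}_f[w_m] - 4d_0.$$

By (11.15),

$$\lim_{k,m \to \infty} \left( 2\mathcal{D}_f[w_k] + 2\mathcal{D}_f[w_m] - 4d_0 \right) = 0,$$

and since $\mathcal{E}[\cdot] \ge 0$ this yields

$$\lim_{k,m \to \infty} \mathcal{E}[w_k - w_m] = 0. \tag{11.18}$$

Using the coercivity estimate (11.11), it follows from (11.18) that

$$\lim_{k,m \to \infty} \|w_k - w_m\|_{H^1} = 0,$$

i.e., the sequence $\{w_k\}$ is Cauchy in $H^1_0(\Omega)$. Therefore, by completeness, there exists a function $u \in H^1_0(\Omega)$ such that

$$u := \lim_{k \to \infty} w_k.$$

Convergence in $H^1$ implies that

$$\mathcal{D}_f[u] = \lim_{k \to \infty} \mathcal{D}_f[w_k] = d_0.$$

Therefore $u$ minimizes $\mathcal{D}_f$.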

To see that the minimizing function is unique, suppose that both $u_1$ and $u_2$ satisfy $\mathcal{D}_f[u_j] = d_0$. By the same reasoning used to derive (11.17) we have

$$d_0 \le \frac{1}{2} \mathcal{D}_f[u_1] + \frac{1}{2} \mathcal{D}_f[u_2] - \frac{1}{4} \mathcal{E}[u_1 - u_2].$$

By assumption both $\mathcal{D}_f[u_1]$ and $\mathcal{D}_f[u_2]$ are equal to $d_0$, so this gives $\mathcal{E}[u_1 - u_2] \le 0$. Since the energy is nonnegative, this implies

$$\mathcal{E}[u_1 - u_2] = 0.$$

Theorem 11.2 then implies $\|u_1 - u_2\|_2 = 0$, so that $u_1 = u_2$. □

Note the crucial role that completeness plays in the proof of Theorem 11.3. If we had taken the domain of the Dirichlet energy to be $C^2(\overline{\Omega})$, then there would be no way to deduce convergence of the sequence $\{w_k\}$ from the energy limit (11.18).
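The Cauchy estimate $\mathcal{E}[w_k - w_m] \le 2\mathcal{D}_f[w_k] + 2\mathcal{D}_f[w_m] - 4d_0$ can be observed on a concrete minimizing sequence. The sketch below is an illustration under stated assumptions rather than the construction used in the proof: it discretizes $(0, 1)$ with a small uniform grid, takes $f = 1$, and generates the sequence by Gauss-Seidel sweeps, each of which minimizes the discrete functional in one coordinate at a time. The infimum $d_0$ is approximated by the value after many sweeps.

```python
n = 16                          # number of grid intervals on (0, 1), illustrative
h = 1.0 / n
f = [1.0] * (n + 1)             # sample right-hand side f = 1

def energy(w):
    """Discrete Dirichlet energy with the factor 1/2 from (11.1)."""
    return 0.5 * h * sum(((w[i + 1] - w[i]) / h) ** 2 for i in range(n))

def D_f(w):
    """Discrete version of (11.6) for grid functions with w[0] = w[n] = 0."""
    return energy(w) - h * sum(f[i] * w[i] for i in range(1, n))

def gauss_seidel_sweep(w):
    """One sweep of coordinate-wise minimization of the discrete D_f."""
    w = list(w)
    for i in range(1, n):
        w[i] = 0.5 * (w[i - 1] + w[i + 1] + h * h * f[i])
    return w

iterates = [[0.0] * (n + 1)]
for _ in range(3000):
    iterates.append(gauss_seidel_sweep(iterates[-1]))
d0 = D_f(iterates[-1])          # numerical stand-in for the infimum

def cauchy_estimate(k, m):
    """Return both sides of E[w_k - w_m] <= 2 D_f[w_k] + 2 D_f[w_m] - 4 d0."""
    diff = [p - q for p, q in zip(iterates[k], iterates[m])]
    return energy(diff), 2 * D_f(iterates[k]) + 2 * D_f(iterates[m]) - 4 * d0

pairs = [(1, 5), (5, 20), (20, 80), (10, 200)]
estimates = {pair: cauchy_estimate(*pair) for pair in pairs}
for pair, (lhs, rhs) in estimates.items():
    print(pair, f"E[w_k - w_m] = {lhs:.3e} <= {rhs:.3e}")
```

In this finite-dimensional setting completeness is automatic, so the example only illustrates the inequality itself; the role of $H^1_0(\Omega)$ in the proof is to supply the limit in infinite dimensions.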

Corollary 11.4 For $f \in L^2(\Omega)$ the weak formulation (11.5) of the Poisson equation admits a unique solution $u \in H^1_0(\Omega)$.

Proof Existence of the solution follows from Theorems 11.1 and 11.3. To prove uniqueness, suppose that $u_1$ and $u_2$ both satisfy (11.5). Subtracting the equations gives

$$\int_\Omega \nabla(u_1 - u_2) \cdot \nabla\psi \, d^n x = 0$$

for all $\psi \in C^\infty_{\mathrm{cpt}}(\Omega)$. By the definition (10.18) of $H^1_0(\Omega)$ we can take a sequence of $\psi_k \in C^\infty_{\mathrm{cpt}}(\Omega)$ such that $\psi_k \to u_1 - u_2$ in the $H^1$ norm. This implies in particular that $\nabla\psi_k \to \nabla(u_1 - u_2)$ in the $L^2$ sense, so that

$$\mathcal{E}[u_1 - u_2] = \lim_{k \to \infty} \frac{1}{2} \int_\Omega \nabla(u_1 - u_2) \cdot \nabla\psi_k \, d^n x = 0.$$

It follows from the Poincaré inequality (Theorem 11.2) that $\|u_1 - u_2\|_2 = 0$, hence $u_1 = u_2$. □


11.4  Elliptic  Regularity 

If $-\Delta u = f$ in the classical sense, then $f$ has a level of differentiability 2 orders below that of $u$. Our goal in this section is to develop a converse to this statement that allows us to deduce the Sobolev regularity of a weak solution $u$ from that of $f$. This type of regularity result holds for elliptic equations in general, but we will focus on the Laplacian for simplicity.

To avoid complications near the boundary of the domain, we introduce local versions of the Sobolev spaces. For $\Omega \subset \mathbb{R}^n$ let

$$H^m_{\mathrm{loc}}(\Omega) := \left\{ u \in L^2_{\mathrm{loc}}(\Omega);\ u\chi \in H^m(\Omega) \text{ for all } \chi \in C^\infty_{\mathrm{cpt}}(\Omega) \right\}.$$
Theorem 11.5 (Interior regularity) Suppose that $u \in H^1_{\mathrm{loc}}(\Omega)$ is a weak solution of

$$-\Delta u = f.$$

If $f \in H^m_{\mathrm{loc}}(\Omega)$ for $m \ge 0$ then $u \in H^{m+2}_{\mathrm{loc}}(\Omega)$.

The Sobolev embedding theorem (Theorem 10.11) gives

$$H^m_{\mathrm{loc}}(\Omega) \subset C^k(\Omega)$$

for $k < m - \frac{n}{2}$. Thus Theorem 11.5 shows that the weak solution of the Poisson equation obtained in Corollary 11.4 is a classical solution provided $f \in H^m_{\mathrm{loc}}(\Omega)$ with $m > \frac{n}{2}$. Furthermore, if $f \in C^\infty(\Omega)$ then $u \in C^\infty(\Omega)$ also.

It is possible to include regularity up to the boundary, although this is technically much more difficult. For example, if $\partial\Omega$ is $C^\infty$ and $u \in H^1_0(\Omega)$ is a weak solution of $-\Delta u = f$ for $f \in C^\infty(\overline{\Omega})$, then $u \in C^\infty(\overline{\Omega})$.

Our proof of Theorem 11.5 makes use of the Fourier analysis on $\mathbb{T}^n$ that we introduced in Sect. 10.4.

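Lemma 11.6 Suppose that $u \in H^1(\mathbb{T}^n)$ solves $-\Delta u = f$ in the weak sense. If $f \in H^m(\mathbb{T}^n)$ for $m \in \mathbb{N}_0$, then $u \in H^{m+2}(\mathbb{T}^n)$.

Proof For $u \in H^1(\mathbb{T}^n)$ and $f \in L^2(\mathbb{T}^n)$, we assume that

$$\int_{\mathbb{T}^n} \left[ \nabla u \cdot \nabla\psi - f\psi \right] d^n x = 0, \tag{11.19}$$

for all $\psi \in C^\infty(\mathbb{T}^n)$. Setting $\psi(x) = e^{-ik \cdot x}$ in this equation gives

$$\int_{\mathbb{T}^n} \left[ -ik \cdot \nabla u(x) - f(x) \right] e^{-ik \cdot x} \, d^n x = 0.$$

Using (10.26), we can translate this into a relation between the Fourier coefficients,

$$c_k[f] = -ik \cdot c_k[\nabla u] = |k|^2 c_k[u]. \tag{11.20}$$

According to Theorem 10.13, if $f \in H^m(\mathbb{T}^n)$ then

$$\sum_{k \in \mathbb{Z}^n} |k|^{2m} \, |c_k[f]|^2 < \infty.$$

By (11.20) this implies that

$$\sum_{k \in \mathbb{Z}^n} |k|^{2m+4} \, |c_k[u]|^2 < \infty,$$

which gives $u \in H^{m+2}(\mathbb{T}^n)$ by Theorem 10.13.

□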
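The coefficient relation $c_k[f] = |k|^2 c_k[u]$ is easy to test on the circle $\mathbb{T}^1$. The sketch below assumes the normalization $c_k[g] = \frac{1}{2\pi} \int_0^{2\pi} g(x) e^{-ikx} \, dx$, which may differ from the convention of Sect. 10.4 by a constant factor that cancels in the relation. The functions `u` and `f` are an illustrative pair with $-u'' = f$, and the integral is replaced by an equally spaced Riemann sum, which is exact for trigonometric polynomials of low degree.

```python
import cmath
import math

def fourier_coefficient(g, k, samples=64):
    """Approximate (1 / 2 pi) * integral of g(x) exp(-i k x) over [0, 2 pi]."""
    total = 0j
    for j in range(samples):
        x = 2 * math.pi * j / samples
        total += g(x) * cmath.exp(-1j * k * x)
    return total / samples

def u(x):
    return math.sin(3 * x) + 0.5 * math.cos(5 * x)

def f(x):
    # f = -u'' for the trigonometric polynomial u above
    return 9 * math.sin(3 * x) + 12.5 * math.cos(5 * x)

max_mismatch = max(
    abs(fourier_coefficient(f, k) - k * k * fourier_coefficient(u, k))
    for k in range(-8, 9)
)
print("largest |c_k[f] - k^2 c_k[u]| for |k| <= 8:", max_mismatch)
```

Multiplying by $|k|^2$ is exactly what shifts the summability condition of Theorem 10.13 by two orders, which is the content of the lemma.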


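We can now deduce the interior elliptic regularity result from Lemma 11.6 by localizing.

Proof of Theorem 11.5. Suppose that $-\Delta u = f$ in the weak sense of (11.5), with $u \in H^1_{\mathrm{loc}}(\Omega)$ and $f \in L^2_{\mathrm{loc}}(\Omega)$. By rescaling and translating, if necessary, we can assume that $\Omega \subset [0, 2\pi]^n$, which allows us to identify $\Omega$ with a subset of $\mathbb{T}^n$. For $\chi \in C^\infty_{\mathrm{cpt}}(\Omega)$ we can extend $u\chi$ by zero to a function on $[0, 2\pi]^n$. This allows us to consider $u\chi$ as a function in $H^1(\mathbb{T}^n)$. Our goal is to apply Lemma 11.6 to the localized function $u\chi$.

We must first show that $u\chi$ satisfies a weak Poisson equation on $\mathbb{T}^n$. Given $\psi \in C^\infty(\mathbb{T}^n)$ we can use the test function $\chi\psi \in C^\infty_{\mathrm{cpt}}(\Omega)$ in the weak solution condition (11.5) to obtain

$$\int_\Omega \left[ \nabla u \cdot \nabla(\chi\psi) - f\chi\psi \right] d^n x = 0.$$

Using the product rule for the gradient, we can rewrite this as

$$0 = \int_\Omega \left[ \chi \nabla u \cdot \nabla\psi + \psi \nabla u \cdot \nabla\chi - f\chi\psi \right] d^n x = \int_\Omega \left[ \nabla(u\chi) \cdot \nabla\psi - u \nabla\chi \cdot \nabla\psi + \psi \nabla u \cdot \nabla\chi - f\chi\psi \right] d^n x. \tag{11.21}$$

In order to interpret this as a weak equation for $u\chi$, we need to rewrite the second term as an integral involving $\psi$ rather than $\nabla\psi$.

Since the components of $\psi\nabla\chi$ are in $C^\infty_{\mathrm{cpt}}(\Omega)$, the definition of the weak derivative $\nabla u$ gives

$$\int_\Omega \nabla u \cdot (\psi \nabla\chi) \, d^n x = -\int_\Omega u \, \nabla \cdot (\psi \nabla\chi) \, d^n x = -\int_\Omega \left[ u \nabla\psi \cdot \nabla\chi + u\psi \Delta\chi \right] d^n x.$$

Using this formula on the term $u \nabla\chi \cdot \nabla\psi$ in (11.21) gives

$$\int_\Omega \left[ \nabla(u\chi) \cdot \nabla\psi + \left( 2\nabla u \cdot \nabla\chi + u \Delta\chi - f\chi \right) \psi \right] d^n x = 0. \tag{11.22}$$

This holds for all $\psi \in C^\infty(\mathbb{T}^n)$, implying that $u\chi$ is a weak solution, in the sense of (11.19), of the equation

$$-\Delta(u\chi) = -2\nabla u \cdot \nabla\chi - u \Delta\chi + f\chi. \tag{11.23}$$

Because $\chi \in C^\infty_{\mathrm{cpt}}(\Omega)$, the right-hand side of (11.23) lies in $L^2(\mathbb{T}^n)$ by the assumptions on $u$ and $f$. Hence $u\chi \in H^2(\mathbb{T}^n)$ by Lemma 11.6. This holds for all $\chi \in C^\infty_{\mathrm{cpt}}(\Omega)$, implying that $u \in H^2_{\mathrm{loc}}(\Omega)$.

We can now apply the same argument inductively. If $u \in H^q_{\mathrm{loc}}(\Omega)$ and $f \in H^{q-1}_{\mathrm{loc}}(\Omega)$, then the right-hand side of (11.23) lies in $H^{q-1}(\mathbb{T}^n)$ for $\chi \in C^\infty_{\mathrm{cpt}}(\Omega)$. Lemma 11.6 then gives $u \in H^{q+1}_{\mathrm{loc}}(\Omega)$. □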


11.5  Eigenvalues  by  Minimization 

In  Sect.  7.6,  we  mentioned  that  the  spectral  theorem  for  finite-dimensional  matrices 
has  an  extension  to  certain  differential  operators.  In  this  section,  we  will  prove  this 
result  in  the  classical  setting  of  the  Laplacian  on  a  bounded  domain  with  Dirichlet 
boundary  conditions. 

Theorem 11.7 (Spectral theorem for the Dirichlet Laplacian) Let $\Omega \subset \mathbb{R}^n$ be a bounded domain. There exists an orthonormal basis $\{\phi_k\}_{k \in \mathbb{N}}$ for $L^2(\Omega)$ such that

$$-\Delta\phi_k = \lambda_k \phi_k$$

with $\phi_k \in H^1_0(\Omega; \mathbb{R}) \cap C^\infty(\Omega)$. Furthermore, $\lambda_k > 0$ for all $k$ and

$$\lim_{k \to \infty} \lambda_k = \infty.$$

In the case of $L^2[0, \pi]$, the sequence of Dirichlet eigenfunctions is given by $\phi_k(x) = \sin(kx)$ for $k \in \mathbb{N}$, with eigenvalues $\lambda_k = k^2$. It follows from Theorem 8.6 that this sequence yields a basis, as shown in Exercise 8.4.

We can solve the eigenvalue problem in the general case by adapting the variational method used for the Poisson equation earlier in this chapter. This method gives more than just existence of the basis; it also suggests a natural strategy for numerical approximation of eigenvalues and eigenfunctions, which we will explore in Sect. 11.7.

The weak formulation of the eigenvalue equation $-\Delta\phi = \lambda\phi$ with Dirichlet boundary conditions is a special case of (10.34). For $\phi \in H^1_0(\Omega)$, the condition is that

$$\int_\Omega \left[ \nabla\phi \cdot \nabla\psi - \lambda\phi\psi \right] d^n x = 0 \tag{11.24}$$

for all $\psi \in C^\infty_{\mathrm{cpt}}(\Omega)$.

If we substitute $\overline{\psi}$ in place of $\psi$ then the second term in (11.24) is $\lambda$ times the $L^2$ inner product $\langle \phi, \psi \rangle$. By the same token we could interpret the first term as an "inner product" form of the Dirichlet energy (11.1). We will denote this by

$$\mathcal{E}[u, v] := \int_\Omega \nabla u \cdot \nabla\overline{v} \, d^n x,$$

so that $\mathcal{E}[u] = \mathcal{E}[u, u]$. The $L^2$ and $H^1$ inner products are related by

$$\langle u, v \rangle_{H^1} = \langle u, v \rangle + \mathcal{E}[u, v].$$

With this convention we can write the weak eigenvalue equation (11.24) in an equivalent form as

$$\mathcal{E}[\phi, \psi] = \lambda \langle \phi, \psi \rangle \tag{11.25}$$

for all $\psi \in C^\infty_{\mathrm{cpt}}(\Omega)$.

For the minimization argument, it will prove convenient to enlarge the space of test functions from $C^\infty_{\mathrm{cpt}}(\Omega)$ to $H^1_0(\Omega)$. To justify this, note that by definition a function $v \in H^1_0(\Omega)$ can be approximated by a sequence of $\psi_k \in C^\infty_{\mathrm{cpt}}(\Omega)$ with respect to the $H^1$ norm. This implies in particular that

$$\lim_{k \to \infty} \mathcal{E}[\phi, \psi_k] = \mathcal{E}[\phi, v], \qquad \lim_{k \to \infty} \langle \phi, \psi_k \rangle = \langle \phi, v \rangle.$$

We can thus conclude that $\phi$ solves (11.24) if and only if

$$\mathcal{E}[\phi, v] = \lambda \langle \phi, v \rangle \tag{11.26}$$

for all $v \in H^1_0(\Omega)$.

The formulation of the eigenvalue equation as a minimization problem is known as Rayleigh's principle, after the physicist Lord Rayleigh. For $v \in H^1_0(\Omega)$, $v \neq 0$, the ratio

$$\mathcal{R}[v] := \frac{\mathcal{E}[v]}{\|v\|_2^2}$$

is called the Rayleigh quotient for $v$. Note also that if $\phi$ satisfies (11.26) then

$$\mathcal{R}[\phi] = \lambda. \tag{11.27}$$

Furthermore, the Poincaré inequality (Theorem 11.2) shows that

$$\mathcal{R}[v] \ge \frac{1}{\kappa^2} > 0 \tag{11.28}$$

for $v \in H^1_0(\Omega)$, $v \neq 0$. This suggests that the smallest eigenvalue is related to the Poincaré constant, and that we can locate it by minimizing $\mathcal{R}[\cdot]$.
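For the interval $[0, \pi]$ the Rayleigh quotient can be evaluated directly. The sketch below assumes the inner-product form of the energy, $\mathcal{E}[v, v] = \int |v'|^2 \, dx$, so that eigenfunctions return their eigenvalues as in (11.27). It uses midpoint quadrature with hand-coded derivatives; the trial function $x(\pi - x)$ is an illustrative choice and the helper names are hypothetical.

```python
import math

def integrate(g, a, b, n=4000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def rayleigh_quotient(v, dv, a=0.0, b=math.pi):
    """Ratio of the integral of |v'|^2 to the integral of |v|^2 over [a, b]."""
    return integrate(lambda x: dv(x) ** 2, a, b) / integrate(lambda x: v(x) ** 2, a, b)

# Dirichlet eigenfunctions sin(kx) on [0, pi] should return k^2.
eigen_quotients = {
    k: rayleigh_quotient(lambda x, k=k: math.sin(k * x),
                         lambda x, k=k: k * math.cos(k * x))
    for k in (1, 2, 3)
}

# A trial function that vanishes at the endpoints but is not an eigenfunction.
trial_quotient = rayleigh_quotient(lambda x: x * (math.pi - x),
                                   lambda x: math.pi - 2 * x)

print("eigenfunction quotients:", eigen_quotients)
print("trial quotient:", trial_quotient)
```

The trial function yields $10/\pi^2 \approx 1.013$, slightly above the smallest eigenvalue $1$, consistent with the idea that the lowest eigenvalue is the minimum of the Rayleigh quotient.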

The argument for existence of a minimum is a little trickier than the analysis of the Dirichlet principle in Sect. 11.3. To understand why, note that

$$\mathcal{R}[cv] = \mathcal{R}[v]$$

for $c \in \mathbb{C} \setminus \{0\}$, so the minimizing function is not unique. Therefore it is quite possible to have a sequence that minimizes the Rayleigh quotient but does not converge in $H^1$.

The tool that allows us to resolve this issue was developed by Franz Rellich in the early 20th century.

Theorem 11.8 (Rellich's theorem) Suppose $\Omega \subset \mathbb{R}^n$ is a bounded domain and $\{v_k\}$ is a sequence in $H^1_0(\Omega)$ that satisfies a uniform bound

$$\|v_k\|_{H^1} \le M$$

for all $k \in \mathbb{N}$. Then $\{v_k\}$ has a subsequence that converges in $L^2(\Omega)$.

It is important to note that Rellich's theorem refers to two different norms. The sequence is assumed bounded in the $H^1$ norm, and the subsequence is guaranteed to converge with respect to the $L^2$ norm. This is crucial to the result and not a mere technicality. We will defer the proof of Rellich's theorem to Sect. 11.6. The remainder of this section is devoted to its application to the Rayleigh minimization scheme.

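Theorem 11.9 (First eigenvalue) There exists $\phi_1 \in H^1_0(\Omega; \mathbb{R}) \cap C^\infty(\Omega)$ satisfying

$$-\Delta\phi_1 = \lambda_1 \phi_1$$

for $\lambda_1 > 0$, such that

$$\lambda_1 \le \mathcal{R}[v] \tag{11.29}$$

for all $v \in H^1_0(\Omega)$, $v \neq 0$.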

Proof By (11.28) the values of $\mathcal{R}[\cdot]$ are bounded below by a positive constant determined by the Poincaré constant. Therefore, the infimum

$$\lambda_1 := \inf_{v \in H^1_0(\Omega) \setminus \{0\}} \mathcal{R}[v]$$

exists and is strictly positive.

By Lemma 2.1 there exists a sequence $\{v_k\} \subset H^1_0(\Omega) \setminus \{0\}$ such that

$$\lim_{k \to \infty} \mathcal{R}[v_k] = \lambda_1. \tag{11.30}$$

After rescaling each $v_k$ by a constant we can assume that $\|v_k\|_2 = 1$, so that

$$\mathcal{R}[v_k] = \mathcal{E}[v_k].$$

The sequence of energies $\mathcal{E}[v_k]$ is bounded by (11.30). This also implies that the sequence $\{v_k\}$ is bounded with respect to the $H^1$ norm, because

$$\|v_k\|_{H^1}^2 = 1 + \mathcal{E}[v_k]$$

by the relation (11.10).
According to Theorem 11.8 there exists a subsequence of $\{v_k\}$ that converges in the $L^2$ sense to some function $\phi_1 \in L^2$. By restricting our attention to this subsequence and relabeling if needed, we can assume that

$$\lim_{k \to \infty} \|v_k - \phi_1\|_2 = 0. \tag{11.31}$$

The next goal is to improve this to a statement of convergence in $H^1$. We use the same strategy as in Sect. 11.3, starting from

$$\mathcal{E}\left[ \frac{v_k + v_m}{2} \right] = \frac{1}{2} \mathcal{E}[v_k] + \frac{1}{2} \mathcal{E}[v_m] - \frac{1}{4} \mathcal{E}[v_k - v_m]. \tag{11.32}$$

By the definition of $\lambda_1$,

$$\mathcal{R}\left[ \frac{v_k + v_m}{2} \right] \ge \lambda_1.$$

Therefore

$$\mathcal{E}\left[ \frac{v_k + v_m}{2} \right] \ge \lambda_1 \left\| \frac{v_k + v_m}{2} \right\|_2^2.$$

Using this in (11.32) gives

$$\mathcal{E}[v_k - v_m] \le 2\mathcal{E}[v_k] + 2\mathcal{E}[v_m] - \lambda_1 \|v_k + v_m\|_2^2. \tag{11.33}$$

By (11.31), we have

$$\lim_{k,m \to \infty} \|v_k + v_m\|_2^2 = \|2\phi_1\|_2^2 = 4,$$

and by construction $\mathcal{E}[v_k] \to \lambda_1$. Hence (11.33) implies that

$$\lim_{k,m \to \infty} \mathcal{E}[v_k - v_m] = 0.$$

Since we already know that $\{v_k\}$ converges in the $L^2$ norm, we conclude that

$$\lim_{k,m \to \infty} \|v_k - v_m\|_{H^1} = 0.$$

That is, the sequence $\{v_k\}$ is Cauchy with respect to the $H^1$ norm.

By completeness $\{v_k\}$ converges with respect to the $H^1$ norm to some $u \in H^1_0(\Omega)$. Since the $L^2$ norm is bounded above by the $H^1$ norm, this means $v_k \to u$ in the $L^2$ sense also. Hence $u = \phi_1$, which proves that $\phi_1 \in H^1_0(\Omega)$. It then follows from (11.30) that

$$\mathcal{E}[\phi_1] = \mathcal{R}[\phi_1] = \lambda_1.$$
To see that this implies the weak solution condition, suppose $w \in H^1_0(\Omega)$. Using the inner product form of $\mathcal{E}[\cdot]$, we can expand

$$\mathcal{E}[\phi_1 + tw] = \lambda_1 + 2t \operatorname{Re} \mathcal{E}[\phi_1, w] + t^2 \mathcal{E}[w], \tag{11.34}$$

for $t \in \mathbb{R}$. Similarly,

$$\|\phi_1 + tw\|_2^2 = 1 + 2t \operatorname{Re} \langle \phi_1, w \rangle + t^2 \|w\|_2^2. \tag{11.35}$$

By the definition of $\lambda_1$, the function $t \mapsto \mathcal{R}[\phi_1 + tw]$ has a minimum at $t = 0$, so that

$$\frac{d}{dt} \mathcal{R}[\phi_1 + tw] \Big|_{t=0} = 0.$$

Computing this derivative using (11.34) and (11.35) gives

$$\frac{d}{dt} \mathcal{R}[\phi_1 + tw] \Big|_{t=0} = 2 \operatorname{Re} \mathcal{E}[\phi_1, w] - 2\lambda_1 \operatorname{Re} \langle \phi_1, w \rangle.$$
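For readers who want the intermediate step, the derivative can be obtained from the quotient rule. In the sketch below the labels $N(t)$ and $D(t)$ are introduced only for this computation and stand for the numerator (11.34) and denominator (11.35) of the Rayleigh quotient.

```latex
\begin{aligned}
N(t) &:= \mathcal{E}[\phi_1 + tw], & N(0) &= \lambda_1, & N'(0) &= 2\operatorname{Re}\mathcal{E}[\phi_1, w], \\
D(t) &:= \|\phi_1 + tw\|_2^2,      & D(0) &= 1,         & D'(0) &= 2\operatorname{Re}\langle \phi_1, w\rangle, \\[4pt]
\frac{d}{dt}\,\frac{N(t)}{D(t)}\bigg|_{t=0}
  &= \frac{N'(0)\,D(0) - N(0)\,D'(0)}{D(0)^2}
   = 2\operatorname{Re}\mathcal{E}[\phi_1, w] - 2\lambda_1 \operatorname{Re}\langle \phi_1, w\rangle .
\end{aligned}
```

The normalization $\|\phi_1\|_2 = 1$ is what makes the denominator drop out at $t = 0$.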


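We thus conclude that

$$\operatorname{Re} \mathcal{E}[\phi_1, w] = \lambda_1 \operatorname{Re} \langle \phi_1, w \rangle$$

for all $w \in H^1_0(\Omega)$. By replacing $w$ by $iw$, we can deduce also that

$$\operatorname{Im} \mathcal{E}[\phi_1, w] = \lambda_1 \operatorname{Im} \langle \phi_1, w \rangle$$

for all $w \in H^1_0(\Omega)$. In combination these give (11.26), so $\phi_1$ is a weak solution of

$$-\Delta\phi_1 = \lambda_1 \phi_1.$$

In principle, $\phi_1$ could be complex-valued at this point, but since its real and imaginary parts each satisfy (11.26) separately, we can select one of these to specialize to the real-valued case.

To deduce the regularity of $\phi_1$, we apply Theorem 11.5 with $f = \lambda_1 \phi_1$. The fact that $\phi_1 \in H^1_0(\Omega)$ then implies that $\phi_1 \in H^3_{\mathrm{loc}}(\Omega)$. Starting from $H^3_{\mathrm{loc}}(\Omega)$ then gives $\phi_1 \in H^5_{\mathrm{loc}}(\Omega)$, and so on. This inductive argument shows that $\phi_1 \in H^q_{\mathrm{loc}}(\Omega)$ for each $q \in \mathbb{N}$. We conclude that $\phi_1 \in C^\infty(\Omega)$ by Theorem 10.11. □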

It is clear that the $\lambda_1$ produced in Theorem 11.9 is the smallest eigenvalue, since all eigenvalues occur as values of the Rayleigh quotient by (11.27). An example of $\phi_1$ is shown in Fig. 11.1. We will see in the exercises that the first eigenfunction cannot have zeros in $\Omega$. This can be used to show that the first eigenfunction is unique up to a multiplicative constant, i.e., $\lambda_1$ has multiplicity one.

Fig. 11.1 The first eigenfunction of an equilateral triangle domain

To find other eigenvalues, the strategy is to restrict to subspaces and then apply the same construction used for $\lambda_1$. For a subset $A \subset L^2(\Omega)$ the orthogonal complement is

$$A^\perp := \left\{ w \in L^2(\Omega);\ \langle w, v \rangle = 0 \text{ for all } v \in A \right\}.$$

The orthogonal complement of a set is a subspace, by the linearity of the inner product.

In the argument below we will consider subspaces of $H^1_0(\Omega)$ of the form

$$W = A^\perp \cap H^1_0(\Omega),$$

where $A$ is a finite list of eigenfunctions. We claim that $W$ is closed as a subspace of $H^1_0(\Omega)$. To see this, suppose that $w_k \to w$ in the $H^1$ norm, with $w_k \in W$. Since $\|w_k - w\|_2 \le \|w_k - w\|_{H^1}$ by (11.10), this implies that $w_k \to w$ with respect to the $L^2$ norm also. Thus for $v \in A$,

$$\langle w, v \rangle = \lim_{k \to \infty} \langle w_k, v \rangle = 0.$$

This shows that $w \in W$. Therefore $W$ is closed. By Lemma 7.8, this implies that $W$ is a Hilbert space with respect to the $H^1$ inner product.

Proof of Theorem 11.7 Let $\phi_1 \in H^1_0(\Omega; \mathbb{R})$ be the eigenvector obtained in Theorem 11.9, normalized so that $\|\phi_1\|_2 = 1$. The subspace

$$W_1 := \{\phi_1\}^\perp \cap H^1_0(\Omega)$$

is a Hilbert space with respect to the $H^1$ norm, by the remarks above.

Applying the minimization procedure used in Theorem 11.9 to the restriction of the Rayleigh quotient to $W_1$ gives $\phi_2 \in W_1$ such that $\|\phi_2\|_2 = 1$ and

$$\lambda_2 := \mathcal{R}[\phi_2] \le \mathcal{R}[w] \tag{11.36}$$

for all $w \in W_1 \setminus \{0\}$. By the same variational argument used for $\phi_1$, this implies that

$$\mathcal{E}[\phi_2, w] = \lambda_2 \langle \phi_2, w \rangle \tag{11.37}$$

for all $w \in W_1$.
To extend this formula to the full weak solution condition, note that $\langle \phi_2, \phi_1 \rangle = 0$ because $\phi_2 \in W_1$. The fact that $\phi_1$ satisfies (11.26) also gives

$$\mathcal{E}[\phi_1, \phi_2] = \lambda_1 \langle \phi_1, \phi_2 \rangle = 0. \tag{11.38}$$

Now consider a general $v \in H^1_0(\Omega)$. For $c := \langle v, \phi_1 \rangle$ we have

$$\langle v - c\phi_1, \phi_1 \rangle = 0,$$

so that $v - c\phi_1 \in W_1$. After setting $w := v - c\phi_1$, we can expand

$$\begin{aligned} \mathcal{E}[\phi_2, v] - \lambda_2 \langle \phi_2, v \rangle = {} & \overline{c} \left( \mathcal{E}[\phi_2, \phi_1] - \lambda_2 \langle \phi_2, \phi_1 \rangle \right) \\ & + \mathcal{E}[\phi_2, w] - \lambda_2 \langle \phi_2, w \rangle. \end{aligned}$$

The first line on the right is zero by (11.38) and the second is zero by (11.37). Thus (11.26) is satisfied for all $v \in H^1_0(\Omega)$, showing that $-\Delta\phi_2 = \lambda_2 \phi_2$ in the weak sense. By taking the real or imaginary part we can assume that $\phi_2$ is real-valued.

Subsequent eigenvalues are obtained by repeating this process inductively. After $k$ eigenfunctions have been found, we set

$$W_k := \{\phi_1, \dots, \phi_k\}^\perp \cap H^1_0(\Omega), \tag{11.39}$$

and minimize the Rayleigh quotient over $W_k$ to find $\lambda_{k+1}$ and $\phi_{k+1}$. Note that $W_k$ always contains nonzero vectors, because $H^1_0(\Omega)$ is infinite-dimensional. The regularity argument from the end of the proof of Theorem 11.9 applies to any solution of (11.26), so that $\phi_k \in C^\infty(\Omega)$ for each $k$.

This process produces an orthonormal sequence of eigenfunctions $\{\phi_k\}$ with eigenvalues satisfying

$$\lambda_1 \le \lambda_2 \le \lambda_3 \le \cdots.$$

To see that $\lambda_k \to \infty$, note that $\|\phi_k\|_2 = 1$ and $\mathcal{R}[\phi_k] = \lambda_k$ by construction, so that

$$\|\phi_k\|_{H^1}^2 = 1 + \lambda_k. \tag{11.40}$$

Suppose the sequence $\{\lambda_k\}$ is bounded. Then (11.40) shows that the sequence $\{\phi_k\}$ is bounded with respect to the $H^1$ norm. Theorem 11.8 then implies that a subsequence of $\{\phi_k\}$ converges in $L^2(\Omega)$. But the $\phi_k$ are orthonormal with respect to the $L^2$ norm, so that

$$\|\phi_k - \phi_m\|_2 = \sqrt{2}$$

for all $k \ne m$. Convergence of a subsequence of $\{\phi_k\}$ is therefore impossible in $L^2(\Omega)$. This contradiction shows that $\{\lambda_k\}$ cannot be bounded. Since the sequence is increasing, this implies

$$\lim_{k\to\infty} \lambda_k = \infty.$$

The final claim is that $\{\phi_k\}$ forms an orthonormal basis of $L^2(\Omega)$. After obtaining the full sequence from the inductive procedure, let us set

$$W_\infty := \{\phi_1, \phi_2, \dots\}^\perp \cap H^1_0(\Omega).$$

Suppose that $W_\infty$ contains a nonzero vector. Applying the Rayleigh quotient minimization as above produces yet another eigenvalue $\lambda$. Since the $\lambda_k$'s were constructed by minimizing the Rayleigh quotient on subspaces $W_k \supset W_\infty$, this new eigenvalue satisfies $\lambda \ge \lambda_k$ for all $k \in \mathbb{N}$. This is impossible because $\lambda_k \to \infty$. Hence $W_\infty = \{0\}$.

In other words, the only vector in $H^1_0(\Omega)$ that is orthogonal to all of the $\phi_k$ is $0$. Since $C^\infty_{\rm cpt}(\Omega) \subset H^1_0(\Omega)$ and $C^\infty_{\rm cpt}(\Omega)$ is dense in $L^2(\Omega)$ by Theorem 7.5, this implies that the only vector in $L^2(\Omega)$ that is orthogonal to all of the $\phi_k$ is $0$. Hence $\{\phi_k\}$ is a basis by Theorem 7.10. □


11.6  Sequential  Compactness 

In  this  section  we  take  up  the  proof  of  Rellich’s  theorem  (Theorem  11.8).  Results 
of  this  type,  that  force  convergence  of  an  approximating  sequence,  are  a  crucial 
component  of  variational  strategies  for  PDE. 

In a normed vector space, a subset $A$ is said to be sequentially compact if every sequence within $A$ contains a subsequence converging to a limit in $A$. A fundamental result in analysis called the Bolzano-Weierstrass theorem (Theorem A.1) says that in $\mathbb{R}^n$ this is equivalent to the definition of compact given in Sect. 2.3. That is, a subset of $\mathbb{R}^n$ is sequentially compact if and only if it is closed and bounded.

Rellich's theorem could be paraphrased as the statement that a closed and bounded subset of $H^1_0(\Omega)$ is sequentially compact as a subset of $L^2(\Omega)$, provided we are careful about the two different norms referenced in this statement. Our strategy will be to reduce Rellich's theorem to an application of Bolzano-Weierstrass using Fourier series. We start with the periodic case.

Theorem 11.10 Suppose that $\{v_j\}$ is a sequence in $H^1(\mathbb{T}^n)$ that satisfies a uniform bound

$$\|v_j\|_{H^1} \le M \tag{11.41}$$

for all $j \in \mathbb{N}$. Then $\{v_j\}$ has a subsequence that converges in $L^2(\mathbb{T}^n)$.

Proof The argument is essentially the same in any dimension, so let us take $n = 1$ to simplify the notation. Suppose that $\{v_j\}$ is a sequence in $H^1(\mathbb{T})$ satisfying (11.41). The periodic Fourier coefficients are defined by

$$c_k[v_j] := \frac{1}{2\pi} \int_{-\pi}^{\pi} v_j(x)\, e^{-ikx}\, dx \tag{11.42}$$

for $k \in \mathbb{Z}$, with corresponding partial sums

$$S_m[v_j] = \sum_{k=-m}^{m} c_k[v_j]\, e^{ikx}.$$

Applying the Cauchy-Schwarz inequality (Theorem 7.1) to (11.42) gives

$$|c_k[v_j]| \le \frac{1}{\sqrt{2\pi}}\, \|v_j\|_2.$$

The assumption (11.41) implies also that the $L^2$ norms $\|v_j\|_2$ are bounded by $M$, so that

$$|c_k[v_j]| \le \frac{M}{\sqrt{2\pi}}$$

for all $j \in \mathbb{N}$ and $k \in \mathbb{Z}$.

We start the process of finding a convergent subsequence by making the first few coefficients converge. The collection of points $(c_{-1}[v_j], c_0[v_j], c_1[v_j])$ forms a bounded sequence in $\mathbb{C}^3$. Applying Bolzano-Weierstrass (Theorem A.1) to this sequence gives a point $(a_{-1}, a_0, a_1) \in \mathbb{C}^3$ and a subsequence

$$\{v_j^{(1)}\} \subset \{v_j\}$$

such that

$$\lim_{j\to\infty} c_k[v_j^{(1)}] = a_k$$

for $k = -1, 0, 1$. Since the partial sum $S_1$ involves only these three coefficients, this implies the uniform convergence

$$\lim_{j\to\infty} S_1[v_j^{(1)}] = \sum_{k=-1}^{1} a_k e^{ikx},$$

which also gives convergence in the $L^2$ sense.

The same reasoning can be applied to the Fourier coefficients with $|k| \le 2$ to obtain a subsequence

$$\{v_j^{(2)}\} \subset \{v_j^{(1)}\}$$

such that

$$\lim_{j\to\infty} c_k[v_j^{(2)}] = a_k$$

for $k = -2, \dots, 2$. This process can be continued inductively to produce a family of nested subsequences $\{v_j^{(l)}\}$ such that

$$\lim_{j\to\infty} S_l[v_j^{(l)}] = \sum_{k=-l}^{l} a_k e^{ikx} \tag{11.43}$$

in $L^2(\mathbb{T})$.

To complete the proof, set

$$w_j := v_j^{(j)}.$$

Because $w_j$ is an element of the $l$th subsequence for $l \le j$, we deduce from (11.43) that

$$\lim_{j\to\infty} S_m[w_j] = \sum_{k=-m}^{m} a_k e^{ikx} \tag{11.44}$$

in $L^2(\mathbb{T})$ for all $m \in \mathbb{N}$.

We now claim that the sequence $\{w_j\}$ converges in $L^2(\mathbb{T})$. In order to deduce this from (11.44) we need to control the rate at which $S_m[w_j]$ converges to $w_j$ as $m \to \infty$. This is where the $H^1$ bound (11.41) becomes crucial. In terms of Fourier coefficients, normalized as in (11.42), the $H^1$ norm can be written

$$\|f\|_{H^1}^2 = 2\pi \sum_{k=-\infty}^{\infty} (1 + k^2)\, |c_k[f]|^2.$$

Hence (11.41) implies that

$$2\pi \sum_{k=-\infty}^{\infty} (1 + k^2)\, |c_k[w_j]|^2 \le M^2$$

for all $j$. This leads to an estimate:

$$\|w_j - S_m[w_j]\|_2^2 = 2\pi \sum_{|k|>m} |c_k[w_j]|^2 \le 2\pi \sum_{|k|>m} \frac{1 + k^2}{1 + m^2}\, |c_k[w_j]|^2 \le \frac{M^2}{1 + m^2}, \tag{11.45}$$

independent of $j$.

By the triangle inequality,

$$\|w_i - w_j\|_2 \le \|w_i - S_m[w_i]\|_2 + \|S_m[w_i] - S_m[w_j]\|_2 + \|w_j - S_m[w_j]\|_2.$$

Given $\varepsilon > 0$, fix $m$ large enough that

$$\frac{M}{\sqrt{1 + m^2}} < \varepsilon.$$

By (11.45) this reduces our triangle estimate to

$$\|w_i - w_j\|_2 \le \|S_m[w_i] - S_m[w_j]\|_2 + 2\varepsilon. \tag{11.46}$$

Since $S_m[w_j]$ converges in $L^2(\mathbb{T})$ by (11.44), we can choose $N$ sufficiently large so that

$$\|S_m[w_i] - S_m[w_j]\|_2 \le \varepsilon$$

for all $i, j \ge N$. By (11.46) this implies that

$$\|w_i - w_j\|_2 \le 3\varepsilon$$

for $i, j \ge N$. This shows that the subsequence $\{w_j\}$ is Cauchy with respect to the $L^2$ norm. By completeness the subsequence converges in $L^2(\mathbb{T})$. □
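The key quantitative step in the proof is the tail bound $\|f - S_m[f]\|_2^2 \le \|f\|_{H^1}^2/(1+m^2)$, which holds uniformly over any family with bounded $H^1$ norm. The following Python sketch checks this bound numerically for one sample function. The coefficient sequence is a hypothetical choice made only for illustration, and the factor $2\pi$ assumes Parseval's identity for coefficients normalized as in (11.42); it cancels in the comparison.

```python
import math

def h1_norm_sq(coeffs):
    """H^1(T) norm squared from Fourier coefficients {k: c_k},
    assuming ||f||_{H^1}^2 = 2*pi * sum (1 + k^2) |c_k|^2."""
    return 2.0 * math.pi * sum((1 + k * k) * abs(c) ** 2 for k, c in coeffs.items())

def tail_norm_sq(coeffs, m):
    """||f - S_m[f]||_2^2 = 2*pi * sum over |k| > m of |c_k|^2."""
    return 2.0 * math.pi * sum(abs(c) ** 2 for k, c in coeffs.items() if abs(k) > m)

# Hypothetical sample: a trigonometric polynomial with slowly decaying coefficients.
K = 200
coeffs = {k: 1.0 / (1 + abs(k)) ** 1.6 for k in range(-K, K + 1)}

M_sq = h1_norm_sq(coeffs)
for m in (1, 5, 20, 100):
    tail = tail_norm_sq(coeffs, m)
    bound = M_sq / (1 + m * m)
    print(f"m={m:4d}  tail={tail:.3e}  bound={bound:.3e}  ok={tail <= bound}")
```

The bound is far from sharp for a single function, but its value lies in being independent of $j$, which is what makes the diagonal subsequence Cauchy.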

Proof of Theorem 11.8 We can assume $\Omega \subset (-\pi, \pi)^n$, after rescaling if needed. By Lemma 10.10, elements of $H^1_0(\Omega)$ can be extended by zero to $H^1_0([-\pi, \pi]^n)$. We can then make these functions periodic to give the inclusion

$$H^1_0(\Omega) \subset H^1(\mathbb{T}^n). \tag{11.47}$$

Given a sequence $\{v_j\} \subset H^1_0(\Omega)$ that is uniformly bounded in the $H^1$ norm, applying Theorem 11.10 to the extension given by (11.47) gives a subsequence $\{w_j\}$ that converges in $L^2(\mathbb{T}^n)$. By construction the restriction of $w_j$ to $[-\pi, \pi]^n$ vanishes outside $\Omega$, so this also gives a convergent subsequence in $L^2(\Omega)$. □


11.7  Estimation  of  Eigenvalues 

As  we  saw  in  Sect.  11.5,  the  Rayleigh  principle  for  the  eigenvalue  problem  is  very 
useful  as  a  theoretical  tool.  It  also  leads  to  some  very  practical  applications  in  terms 
of  estimating  eigenvalues  or  calculating  them  numerically. 

The basic strategy is to exploit the formula for eigenvalues that appeared in the proof of Theorem 11.7. Assuming the Dirichlet eigenvalues $\{\lambda_k\}$ of $\Omega$ are written in increasing order and repeated according to multiplicity,

$$\lambda_k = \min_{w \in W_{k-1}\setminus\{0\}} \mathcal{R}[w], \tag{11.48}$$

where

$$W_{k-1} := \{\phi_1, \dots, \phi_{k-1}\}^\perp \cap H^1_0(\Omega),$$

with $\phi_k$ the eigenfunction corresponding to $\lambda_k$. The only problem with this formula is that determining the $k$th eigenfunction requires knowledge of the first $k - 1$ eigenfunctions. This issue is resolved by the following:

Theorem 11.11 (Minimax principle) For a bounded domain $\Omega \subset \mathbb{R}^n$, let $\Lambda_k$ denote the set of all $k$-dimensional subspaces of $H^1_0(\Omega)$. Let $\{\lambda_k\}$ denote the sequence of Dirichlet eigenvalues of $\Omega$ in increasing order. Then

$$\lambda_k = \min_{V \in \Lambda_k}\; \max_{u \in V\setminus\{0\}} \mathcal{R}[u] \tag{11.49}$$

for each $k \in \mathbb{N}$.

Proof Let $\{\phi_k\} \subset H^1_0(\Omega)$ denote the eigenfunction basis. By the weak solution condition (11.25), orthonormality in $L^2(\Omega)$ implies also that

$$\mathcal{E}[\phi_i, \phi_j] = \begin{cases} \lambda_i, & i = j, \\ 0, & i \ne j. \end{cases} \tag{11.50}$$

Let us set

$$V_{\rm eig} = [\phi_1, \dots, \phi_k] \subset H^1_0(\Omega),$$

where $[\dots]$ denotes the linear span of a collection of vectors. For $u \in V_{\rm eig}$, expanded as

$$u = \sum_{j=1}^{k} c_j \phi_j,$$

we see from (11.50) that

$$\mathcal{E}[u] = \sum_{j=1}^{k} \lambda_j |c_j|^2.$$

The Rayleigh quotient is thus

$$\mathcal{R}[u] = \frac{\sum_{j=1}^{k} \lambda_j |c_j|^2}{\sum_{j=1}^{k} |c_j|^2}.$$

Since $\lambda_1 \le \cdots \le \lambda_k$ by assumption, it is clear that

$$\mathcal{R}[u] \le \lambda_k$$

for $u \in V_{\rm eig}\setminus\{0\}$. Moreover, $\mathcal{R}[\phi_k] = \lambda_k$, so that

$$\max_{u \in V_{\rm eig}\setminus\{0\}} \mathcal{R}[u] = \lambda_k. \tag{11.51}$$

Now consider a general subspace $V \in \Lambda_k$. Since $\dim[\phi_1, \dots, \phi_{k-1}] = k - 1$, there exists a nonzero vector

$$w \in V \cap [\phi_1, \dots, \phi_{k-1}]^\perp.$$

Since $w \in [\phi_1, \dots, \phi_{k-1}]^\perp$, it follows from (11.48) that

$$\lambda_k \le \mathcal{R}[w].$$

Hence,

$$\lambda_k \le \max_{w \in V\setminus\{0\}} \mathcal{R}[w] \tag{11.52}$$

for $V \in \Lambda_k$.

In combination, (11.51) and (11.52) show that the minimum on the right-hand side of (11.49) exists and is equal to $\lambda_k$. □

As a sample application, we can use the minimax principle to compare eigenvalues of nested domains. This is possible because of the inclusion

$$H^1_0(\Omega) \subset H^1_0(\tilde\Omega), \tag{11.53}$$

provided by Lemma 10.10.

Corollary 11.12 Consider two bounded domains in $\mathbb{R}^n$ satisfying $\Omega \subset \tilde\Omega$. Assuming the Dirichlet eigenvalue sequences are arranged in increasing order,

$$\lambda_k(\tilde\Omega) \le \lambda_k(\Omega)$$

for all $k \in \mathbb{N}$.

Proof Let $\{\phi_k\} \subset H^1_0(\Omega)$ be the sequence of eigenfunctions of $\Omega$. By (11.53), we can consider $[\phi_1, \dots, \phi_k]$ to be a subspace of $H^1_0(\tilde\Omega)$. If $\tilde\Lambda_k$ denotes the set of $k$-dimensional subspaces of $H^1_0(\tilde\Omega)$, then this implies

$$\min_{W \in \tilde\Lambda_k}\; \max_{u \in W\setminus\{0\}} \mathcal{R}[u] \le \max_{u \in [\phi_1, \dots, \phi_k]\setminus\{0\}} \mathcal{R}[u].$$

By Theorem 11.11, the left-hand side is $\lambda_k(\tilde\Omega)$, while the right-hand side equals $\lambda_k(\Omega)$ by (11.51). □
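Domain monotonicity can be checked directly in a case where the eigenvalues are explicit. The sketch below uses rectangles $(0,a)\times(0,b)$, whose Dirichlet eigenvalues are $\pi^2(j^2/a^2 + k^2/b^2)$ for $j, k \in \mathbb{N}$; this standard formula and the particular side lengths are assumptions brought in for illustration and are not taken from the text.

```python
import math

def rectangle_eigenvalues(a, b, count):
    """First `count` Dirichlet eigenvalues of -Laplacian on (0,a) x (0,b),
    in increasing order and repeated according to multiplicity."""
    # Indices up to `count` in each direction are enough to capture the
    # `count` smallest values pi^2 (j^2/a^2 + k^2/b^2).
    values = [math.pi ** 2 * ((j / a) ** 2 + (k / b) ** 2)
              for j in range(1, count + 1)
              for k in range(1, count + 1)]
    return sorted(values)[:count]

small = rectangle_eigenvalues(1.0, 1.0, 10)   # Omega: unit square
large = rectangle_eigenvalues(1.5, 1.2, 10)   # larger rectangle containing Omega

for k, (lam_large, lam_small) in enumerate(zip(large, small), start=1):
    print(f"k={k:2d}  lambda_k(large)={lam_large:8.3f}  lambda_k(small)={lam_small:8.3f}")
```

Each eigenvalue of the larger rectangle is bounded above by the corresponding eigenvalue of the unit square, as Corollary 11.12 predicts.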

Another way to make use of the Rayleigh and minimax principles is to approximate eigenvalues by restricting to subspaces of computationally simple functions within $H^1_0(\Omega)$. This approach can give surprisingly good estimates even when the subspaces are small.

Example 11.13 On the unit disk $\mathbb{D} = \{r < 1\} \subset \mathbb{R}^2$, consider the family of functions

$$w_\alpha(r) := 1 - r^\alpha$$

for $\alpha > 0$. Because $w_\alpha$ is radial, the energy can be computed by

$$\mathcal{E}[w_\alpha] = \int_0^1 \left(\frac{\partial w_\alpha}{\partial r}\right)^2 2\pi r\, dr = 2\pi\alpha^2 \int_0^1 r^{2\alpha - 1}\, dr = \pi\alpha.$$

The $L^2$ norm is

$$\|w_\alpha\|_2^2 = \int_0^1 (1 - r^\alpha)^2\, 2\pi r\, dr = \frac{\pi\alpha^2}{2 + 3\alpha + \alpha^2}.$$

Hence the Rayleigh quotient gives the bound

$$\lambda_1 \le \frac{2}{\alpha} + 3 + \alpha$$

for $\alpha > 0$. The optimal choice is $\alpha = \sqrt{2}$, which gives

$$\lambda_1 \le 3 + 2\sqrt{2} \approx 5.828. \tag{11.54}$$

Compare this to the exact value computed in Example 5.5 in terms of zeros of the Bessel $J$-function,

$$\lambda_1 = j_{0,1}^2 \approx 5.783. \tag{11.55}$$

The optimal choice of $w_\alpha$ gives a reasonable approximation to the true eigenfunction, as Fig. 11.2 demonstrates. ◊
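The computation in Example 11.13 is easy to reproduce numerically. The following Python sketch evaluates the Rayleigh quotient of $w_\alpha$ by a midpoint rule, compares it with the closed form $2/\alpha + 3 + \alpha$, and scans a grid of $\alpha$ values for the best bound. The quadrature resolution, the grid, and the hard-coded value of the Bessel zero $j_{0,1}$ are choices made here for illustration.

```python
import math

J01_SQUARED = 2.404825557695773 ** 2   # j_{0,1}^2, first zero of J_0 (quoted constant)

def rayleigh_quotient(alpha, n=20000):
    """Midpoint-rule Rayleigh quotient of w(r) = 1 - r**alpha on the unit disk."""
    h = 1.0 / n
    energy = 0.0
    norm_sq = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        dw = -alpha * r ** (alpha - 1.0)      # radial derivative of w
        w = 1.0 - r ** alpha
        energy += dw * dw * 2.0 * math.pi * r * h
        norm_sq += w * w * 2.0 * math.pi * r * h
    return energy / norm_sq

def closed_form(alpha):
    """Closed-form Rayleigh quotient 2/alpha + 3 + alpha from the example."""
    return 2.0 / alpha + 3.0 + alpha

# Scan a grid of exponents for the smallest upper bound.
alphas = [0.5 + 0.01 * i for i in range(300)]
best = min(alphas, key=closed_form)

print("numerical R[w_sqrt2]:", rayleigh_quotient(math.sqrt(2.0)))
print("closed form         :", closed_form(math.sqrt(2.0)))
print("best alpha on grid  :", best)
print("exact lambda_1      :", J01_SQUARED)
```

Every value of $\alpha$ gives a valid upper bound for $\lambda_1$; optimizing over $\alpha$ only tightens it.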


Fig. 11.2 The Bessel function for the first eigenfunction and $w_{\sqrt{2}}$

We can estimate higher eigenvalues and improve the accuracy by using a larger subspace. This computational strategy was introduced by Walter Ritz in 1909, and is referred to as the Rayleigh-Ritz method. Given a finite-dimensional subspace $A \subset H^1_0(\Omega)$ we let $\Lambda_k(A)$ denote the set of $k$-dimensional subspaces of $A$. The approximate eigenvalues associated to $A$ are then given by

$$\eta_k := \min_{W \in \Lambda_k(A)} \left\{ \max_{u \in W\setminus\{0\}} \mathcal{R}[u] \right\}, \tag{11.56}$$

for $k = 1, \dots, \dim A$.

Since $A$ is finite dimensional, the calculation of (11.56) can be recast as a matrix eigenvalue problem. By the same arguments used in the proof of Theorem 11.7, the values $\eta_k$ are associated to vectors $v_k \in A$ satisfying the approximate weak eigenvalue equation,

$$\mathcal{E}[v_k, w] = \eta_k \langle v_k, w\rangle \tag{11.57}$$

for all $w \in A$.

To interpret (11.57) as a matrix eigenvalue equation we fix a basis $\{w_j\}_{j=1}^m$ for $A$. In terms of this basis, the energy functional and $L^2$ inner product define matrices

$$E_{ij} := \mathcal{E}[w_i, w_j], \qquad F_{ij} := \langle w_i, w_j\rangle. \tag{11.58}$$

If $v_k$ is expanded as

$$v_k = \sum_{j=1}^{m} c_j w_j,$$

then (11.57) is equivalent to

$$\sum_{i=1}^{m} c_i \left(E_{ij} - \eta_k F_{ij}\right) = 0 \tag{11.59}$$

for $j = 1, \dots, m$. This equation has a nontrivial solution only if the rows of the matrix $E - \eta_k F$ are linearly dependent, which is equivalent to the vanishing of the determinant. The values $\eta_k$ can thus be calculated as the roots of a polynomial,

$$\{\eta_1, \dots, \eta_m\} = \{\eta : \det(E - \eta F) = 0\}. \tag{11.60}$$

In other words, the $\eta_j$ are the eigenvalues of $EF^{-1}$.

Example 11.14 Consider $\mathbb{D}$ as in Example 11.13, but now take the subspace $A = [w_1, w_2, w_3]$, where

$$w_j(r) := 1 - r^{2j}.$$

Straightforward computations give the matrices (11.58) as

$$E = \begin{pmatrix} 2\pi & \frac{8\pi}{3} & 3\pi \\ \frac{8\pi}{3} & 4\pi & \frac{24\pi}{5} \\ 3\pi & \frac{24\pi}{5} & 6\pi \end{pmatrix}, \qquad F = \begin{pmatrix} \frac{\pi}{3} & \frac{5\pi}{12} & \frac{9\pi}{20} \\ \frac{5\pi}{12} & \frac{8\pi}{15} & \frac{7\pi}{12} \\ \frac{9\pi}{20} & \frac{7\pi}{12} & \frac{9\pi}{14} \end{pmatrix}.$$

The roots of $\det(E - \eta F)$ are

$$\eta_1 \approx 5.783, \qquad \eta_2 \approx 30.712, \qquad \eta_3 \approx 113.505.$$

The estimate $\eta_1$ matches the exact value (11.55) very closely; in fact,

$$|\lambda_1 - \eta_1| \approx 10^{-6}.$$

The second value $\eta_2$ is a reasonable approximation to

$$\lambda_6 = j_{0,2}^2 \approx 30.471.$$

However, we missed the second eigenvalue

$$\lambda_2 = j_{1,1}^2 \approx 14.682.$$

The problem is that the space $A$ consists entirely of radial functions, so that the second eigenfunction,

$$\phi_2(r, \theta) = J_1(j_{1,1}\, r)\, e^{i\theta},$$

is orthogonal to $A$. ◊
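The numbers in Example 11.14 can be reproduced with a few lines of Python. The sketch below builds the matrices from the closed-form integrals $E_{ij} = 4\pi ij/(i+j)$ and $F_{ij} = \pi\,[1 - \tfrac{1}{i+1} - \tfrac{1}{j+1} + \tfrac{1}{i+j+1}]$ (derived here from $w_j = 1 - r^{2j}$, with the common factor $\pi$ dropped since it does not affect the roots), and then locates the roots of $\det(E - \eta F)$ by a sign scan followed by bisection. The scan range and step are assumptions chosen for this example.

```python
from fractions import Fraction

def ritz_matrices(m):
    """Return E/pi and F/pi for the basis w_j(r) = 1 - r^(2j), j = 1..m."""
    idx = range(1, m + 1)
    E = [[Fraction(4 * i * j, i + j) for j in idx] for i in idx]
    F = [[1 - Fraction(1, i + 1) - Fraction(1, j + 1) + Fraction(1, i + j + 1)
          for j in idx] for i in idx]
    return E, F

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def char_poly(E, F, eta):
    """Evaluate det(E - eta F) for 3x3 matrices E, F."""
    return det3([[float(E[r][c]) - eta * float(F[r][c]) for c in range(3)]
                 for r in range(3)])

def bisect_root(fn, lo, hi, tol=1e-10):
    """Bisection on [lo, hi], assuming fn changes sign on the interval."""
    f_lo = fn(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = fn(mid)
        if (f_lo > 0) == (f_mid > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

E, F = ritz_matrices(3)
p = lambda eta: char_poly(E, F, eta)
grid = [float(k) for k in range(0, 201)]          # scan eta in [0, 200]
roots = [bisect_root(p, a, b) for a, b in zip(grid, grid[1:])
         if (p(a) > 0) != (p(b) > 0)]
print(roots)
```

For larger bases the matrix $F$ becomes badly conditioned, so a dedicated generalized eigenvalue solver would be preferable to root finding on the determinant.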

The missing eigenvalue in Example 11.14 illustrates a potential flaw in the Rayleigh-Ritz scheme. We need to make sure the subspace $A$ covers $H^1_0(\Omega)$ sufficiently well in order to catch all low-lying eigenvalues. At the same time, we also need a means of producing this subspace efficiently. The finite element method is an approach that addresses both of these concerns.

To set up the finite element method we subdivide $\Omega$ into small polygonal domains, producing a mesh (or triangulation). Figure 11.3 illustrates a mesh for a two-dimensional domain. To each interior vertex is associated a piecewise linear function called an "element" that is positive at the vertex and decays linearly to zero on the neighboring faces. The span of these elements defines a subspace $A$ with dimension equal to the number of interior vertices.

Example 11.15 Let us consider the domain $(0, \pi)$, for which the exact Dirichlet eigenfunctions are $\phi_k(x) = \sin kx$ for $k \in \mathbb{N}$, with $\lambda_k = k^2$. Define a mesh by subdividing $(0, \pi)$ into $m + 1$ intervals of length $\pi/(m + 1)$. To the $j$th vertex we associate the element

$$w_j(x) = \begin{cases} x - \frac{j-1}{m+1}\pi, & \frac{j-1}{m+1}\pi \le x \le \frac{j}{m+1}\pi, \\[4pt] -x + \frac{j+1}{m+1}\pi, & \frac{j}{m+1}\pi \le x \le \frac{j+1}{m+1}\pi, \\[4pt] 0, & \text{otherwise}, \end{cases}$$

for $j = 1, \dots, m$. These elements are illustrated in Fig. 11.4.

Fig. 11.3 Discrete mesh for an oval domain in $\mathbb{R}^2$

Fig. 11.4 Piecewise linear elements for $(0, \pi)$ with $m = 3$

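The energy and inner product matrices can be computed by straightforward integrals,

$$E_{ij} = \begin{cases} \frac{2\pi}{m+1}, & i = j, \\[4pt] -\frac{\pi}{m+1}, & |i - j| = 1, \\[4pt] 0, & \text{otherwise}, \end{cases} \qquad F_{ij} = \begin{cases} \frac{2\pi^3}{3(m+1)^3}, & i = j, \\[4pt] \frac{\pi^3}{6(m+1)^3}, & |i - j| = 1, \\[4pt] 0, & \text{otherwise}. \end{cases}$$

Table 11.1 shows the resulting approximate eigenvalues as a function of the number of elements $m$. ◊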
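The finite element eigenvalues for $(0,\pi)$ can be recomputed without any linear algebra library, because $E - \eta F$ is tridiagonal. The sketch below assumes the hat functions of slope $\pm 1$ on a uniform mesh of width $h = \pi/(m+1)$, so that $E$ has entries $2h$ and $-h$ and $F$ has entries $2h^3/3$ and $h^3/6$. It counts the eigenvalues below a trial value $\eta$ from the signs of the pivots in the $LDL^T$ factorization of $E - \eta F$ (Sylvester's law of inertia), and then bisects. This is one possible numerical approach, not necessarily the one used to produce Table 11.1.

```python
import math

def count_below(eta, m):
    """Number of generalized eigenvalues of E c = eta F c lying below eta,
    read off from the pivot signs of the tridiagonal matrix E - eta*F."""
    h = math.pi / (m + 1)
    diag = 2.0 * h - eta * (2.0 * h ** 3 / 3.0)
    off = -h - eta * (h ** 3 / 6.0)
    count = 0
    pivot = diag
    for i in range(m):
        if i > 0:
            pivot = diag - off * off / pivot
        if pivot == 0.0:
            pivot = 1e-30          # nudge an exact zero pivot to keep the recursion going
        if pivot < 0.0:
            count += 1
    return count

def fem_eigenvalue(k, m, tol=1e-10):
    """k-th approximate eigenvalue (k = 1..m) by bisection on count_below."""
    h = math.pi / (m + 1)
    lo, hi = 0.0, 12.0 / h ** 2 + 1.0   # Gershgorin-type upper bound on eta_m
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_below(mid, m) >= k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

for m in (3, 10, 25, 50):
    row = [fem_eigenvalue(k, m) for k in range(1, min(m, 5) + 1)]
    print(m, ["%.2f" % eta for eta in row])
```

Each approximate eigenvalue lies above the exact value $k^2$, consistent with the minimax principle applied to the subspace $A \subset H^1_0(0,\pi)$.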

The finite element method produces approximate eigenfunctions as well as eigenvalues. Once the eigenvalue $\eta_k$ is determined, the coefficients $c_j$ can be computed from (11.59). These coefficients represent the eigenvectors of $(EF^{-1})^T$ in the basis $\{w_j\}$. The function $v_k = \sum_{j=1}^m c_j w_j$ is an approximation within the subspace $A$ to the true eigenfunction $\phi_k$. Figure 11.5 shows some approximate eigenfunctions determined by the calculations in Example 11.15.

The Rayleigh-Ritz strategy proves to be quite adaptable to more complicated problems. The procedure consists of: (1) generating a mesh for a given domain, (2) computing the matrix entries $E_{ij}$ and $F_{ij}$ corresponding to the elements of the mesh, and (3) solving a finite-dimensional eigenvalue problem. A two-dimensional example is shown in Fig. 11.6.

Table 11.1 Approximate Dirichlet eigenvalues for $(0, \pi)$ computed using $m$ elements

m           3        10       25       50       exact
$\eta_1$    1.05     1.01     1.00     1.00     1
$\eta_2$    4.86     4.11     4.02     4.01     4
$\eta_3$    12.84    9.56     9.10     9.03     9
$\eta_4$    -        17.80    16.31    16.08    16
$\eta_5$    -        29.45    25.77    25.20    25

Fig. 11.5 Piecewise linear approximations to the eigenfunction $\phi_3(x) = \sin 3x$

Fig. 11.6 An approximation of the eigenfunction $\phi_6$ for a star-shaped domain


11.8  Euler-Lagrange  Equations 

In the mid-18th century, Lagrange and Euler jointly developed a framework for
expressing  problems  in  classical  mechanics  in  terms  of  the  minimization  of  an  action 
functional.  Euler  coined  the  term  calculus  of  variations  to  describe  this  approach, 
which  proved  adaptable  to  a  great  variety  of  problems. 

In the original classical mechanics setting, the action functional was the integral of a Lagrangian function, defined as kinetic energy minus potential energy. These might be energies of a single particle or a system.

In a typical PDE application on a bounded domain $\Omega \subset \mathbb{R}^n$, we take the Lagrangian $L(p, w, x)$ to be a smooth function

$$L : \mathbb{R}^n \times \mathbb{R} \times \Omega \to \mathbb{R}.$$

The action functional is defined by

$$S[w] := \int_\Omega L(\nabla w(x), w(x), x)\, d^n x$$

for $w \in C^1(\Omega; \mathbb{R})$. Suppose that $u$ is a critical point of $S$, in the sense that

$$\frac{d}{dt} S[u + t\psi]\Big|_{t=0} = 0$$

for every $\psi \in C^\infty_{\rm cpt}(\Omega)$. This implies a (possibly nonlinear) PDE for $u$, called the Euler-Lagrange equation of $L$.
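For orientation, the general shape of this equation can be sketched as follows. The computation assumes that $L$ and $u$ are smooth enough to differentiate under the integral and to integrate by parts, with the boundary term vanishing because $\psi$ has compact support; these regularity assumptions are made here for illustration only.

```latex
\frac{d}{dt} S[u + t\psi]\Big|_{t=0}
  = \int_\Omega \Big( \sum_{j=1}^n \frac{\partial L}{\partial p_j}\,
      \frac{\partial \psi}{\partial x_j}
      + \frac{\partial L}{\partial w}\,\psi \Big)\, d^n x
  = \int_\Omega \Big( \frac{\partial L}{\partial w}
      - \sum_{j=1}^n \frac{\partial}{\partial x_j}
        \frac{\partial L}{\partial p_j} \Big)\, \psi \, d^n x .
```

Here the derivatives of $L$ are evaluated at $(\nabla u(x), u(x), x)$. Requiring the last integral to vanish for every $\psi$ gives $\frac{\partial L}{\partial w} - \sum_{j} \frac{\partial}{\partial x_j} \frac{\partial L}{\partial p_j} = 0$.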

The Dirichlet principle gives the most basic example of this setup. For

$$L(p, w, x) := \frac{|p|^2}{2} \tag{11.61}$$

the action functional is the Dirichlet energy $\mathcal{E}[w]$, and the Euler-Lagrange equation is the Laplace equation. To formulate the Poisson equation, we modify the Lagrangian to include the forcing term $f \in L^2(\Omega; \mathbb{R})$,

$$L(p, w, x) := \frac{|p|^2}{2} - f w. \tag{11.62}$$

In this case the action is the functional $\mathcal{D}_f[w]$ defined in (11.6).

A classic nonlinear example is the surface area minimization problem. For $\Omega \subset \mathbb{R}^2$, the graph of a function $w : \Omega \to \mathbb{R}$ defines a surface in $\mathbb{R}^3$. According to (2.8), the surface area of this patch of surface is given by

$$\mathcal{A}[w] := \int_\Omega \sqrt{1 + |\nabla w|^2}\; d^2x.$$

We can interpret this as an action functional corresponding to the Lagrangian

$$L(p, w, x) := \sqrt{1 + |p|^2}.$$

For $f : \partial\Omega \to \mathbb{R}$, the problem of minimizing $\mathcal{A}[w]$ under the constraint $w|_{\partial\Omega} = f$ was first studied by Lagrange. Historically, however, this is called the Plateau problem, after the 19th century physicist Joseph Plateau, who conducted experiments on minimal surfaces using soap films.




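Let us work out the Euler-Lagrange equation for the surface area functional. For $\psi \in C^\infty_{\rm cpt}(\Omega)$, we have

$$\frac{d}{dt} \mathcal{A}[u + t\psi]\Big|_{t=0} = \frac{d}{dt} \int_\Omega \sqrt{1 + |\nabla u + t\nabla\psi|^2}\; d^2x\, \Big|_{t=0} = \int_\Omega \frac{\nabla u \cdot \nabla\psi}{\sqrt{1 + |\nabla u|^2}}\; d^2x.$$

By Green's first identity (Theorem 2.10), and the fact that $\psi$ vanishes near the boundary,

$$\frac{d}{dt} \mathcal{A}[u + t\psi]\Big|_{t=0} = -\int_\Omega \psi\, \nabla \cdot \left( \frac{\nabla u}{\sqrt{1 + |\nabla u|^2}} \right) d^2x.$$

Setting this equal to zero for all $\psi$ gives the Euler-Lagrange equation

$$\nabla \cdot \left( \frac{\nabla u}{\sqrt{1 + |\nabla u|^2}} \right) = 0. \tag{11.63}$$

This nonlinear PDE is called the minimal surface equation.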
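A quick numerical sanity check of (11.63) is possible with an explicit solution. The sketch below uses Scherk's surface $u(x,y) = \log\cos y - \log\cos x$, a classical minimal graph that is not discussed in the text and is brought in here only as a test case, and evaluates the left-hand side of (11.63) by nested central differences. A paraboloid is included for contrast. The step size and sample point are arbitrary choices.

```python
import math

def scherk(x, y):
    """Scherk's surface (classical explicit minimal graph), for |x|, |y| < pi/2."""
    return math.log(math.cos(y)) - math.log(math.cos(x))

def paraboloid(x, y):
    """A graph that is not a minimal surface, for comparison."""
    return x * x + y * y

def flux(u, x, y, h):
    """grad u / sqrt(1 + |grad u|^2), with the gradient from central differences."""
    ux = (u(x + h, y) - u(x - h, y)) / (2.0 * h)
    uy = (u(x, y + h) - u(x, y - h)) / (2.0 * h)
    s = math.sqrt(1.0 + ux * ux + uy * uy)
    return ux / s, uy / s

def minimal_surface_residual(u, x, y, h=1e-3):
    """Central-difference approximation of div( grad u / sqrt(1 + |grad u|^2) )."""
    fx_plus, _ = flux(u, x + h, y, h)
    fx_minus, _ = flux(u, x - h, y, h)
    _, fy_plus = flux(u, x, y + h, h)
    _, fy_minus = flux(u, x, y - h, h)
    return (fx_plus - fx_minus) / (2.0 * h) + (fy_plus - fy_minus) / (2.0 * h)

print("Scherk residual    :", minimal_surface_residual(scherk, 0.3, -0.5))
print("paraboloid residual:", minimal_surface_residual(paraboloid, 0.3, -0.5))
```

The residual for Scherk's surface is zero up to discretization error, while the paraboloid gives a value of order one.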

There  is  a  well-developed  existence  and  regularity  theory  for  general  Euler- 
Lagrange  equations,  but  this  is  too  technical  for  us  to  go  into  here.  In  many  cases,  the 
finite  element  method  can  be  used  to  effectively  reduce  the  numerical  approximation 
of  solutions  to  linear  algebra.  Figure  11.7  shows  a  solution  of  the  minimal  surface 
equation  calculated  using  finite  elements. 


Fig. 11.7 The minimal surface over the unit disk associated to the boundary function $f(\theta) = \cos 6\theta$

11.9 Exercises

11.1 For a finite-dimensional subspace $A \subset H^1_0(\Omega)$, suppose we approximate the solution of the Poisson equation (11.4) for $f \in L^2(\Omega)$ by choosing $u \in A$ such that

$$\mathcal{D}_f[u] = \min_{w \in A} \mathcal{D}_f[w].$$

Given a basis $w_1, \dots, w_m$ for $A$ (not necessarily orthonormal), set $u = \sum_{i=1}^m c_i w_i$. Find an equation for $(c_1, \dots, c_m)$ in terms of $f$ and the matrices $E$ and $F$ defined in (11.58).

11.2 To demonstrate the role that ellipticity plays in Theorem 11.5, consider the operator $L = r^2\Delta$ on the unit ball $B = \{r < 1\} \subset \mathbb{R}^3$, where $r := |x|$. For $f \in L^2(B)$, a weak solution of the equation $Lu = f$ with Dirichlet boundary conditions is defined as a function $u \in H^1_0(B)$ satisfying

$$\int_B \left[ \nabla u \cdot \nabla(r^2\psi) + f\psi \right] d^3x = 0 \tag{11.64}$$

for all $\psi \in C^\infty_{\rm cpt}(B)$.

(a) Compute the weak partial derivatives of the function $\log r$ and show that $\log r \in H^1_0(B)$.

(b) Show that $u(x) = \log r$ satisfies (11.64) with $f = 1$. Note that even though $\partial B$ and $f$ are $C^\infty$, the solution $u$ is not even $H^2$. (It is not a coincidence that the singularity of $u$ occurs at the point where ellipticity of the operator fails.)

11.3 Let $\Omega \subset \mathbb{R}^2$ be the equilateral triangle with vertices $(0, 0)$, $(2, 0)$, and $(1, \sqrt{3})$. Define $w \in H^1_0(\Omega)$ to be the piecewise linear function whose graph forms a tetrahedron over $\Omega$, with the top vertex at $(1, 1/\sqrt{3}, 1/\sqrt{3})$. Approximate the first eigenvalue by computing the Rayleigh quotient $\mathcal{R}[w]$. (For comparison, the exact value is $\lambda_1 = 4\pi^2/3$; this corresponds to the eigenfunction shown in Fig. 11.1.)

11.4 Let $\mathbb{B}^3$ denote the unit ball $\{r < 1\} \subset \mathbb{R}^3$. Find an upper bound on the first eigenvalue by computing the Rayleigh quotient of the radial function $w(r) = 1 - r$. (Compare your answer to the exact value $\lambda_1 = \pi^2$ found in Exercise 5.8.)

11.5 For a bounded domain $\Omega \subset \mathbb{R}^n$, let $\phi_1 \in H^1_0(\Omega; \mathbb{R}) \cap C^\infty(\Omega)$ be the first eigenfunction as obtained in Theorem 11.9, normalized so that $\|\phi_1\|_2 = 1$. In this problem we will show that $\phi_1$ has no zeros in $\Omega$ and that this eigenfunction is unique up to a multiplicative constant.

(a) Set

$$\phi_\pm(x) := \max\{\pm\phi_1(x), 0\}$$

for $x \in \Omega$, and show that

$$\phi_1 = \phi_+ - \phi_-$$

with $\phi_\pm \ge 0$ and $\phi_+\phi_- = 0$.

(b) Note that Lemma 10.10 implies that $\phi_\pm \in H^1_0(\Omega; \mathbb{R})$. Assuming that $\phi_1$ is normalized so that $\|\phi_1\|_2 = 1$, show that

$$\|\phi_+\|_2^2 + \|\phi_-\|_2^2 = 1,$$
$$\mathcal{E}[\phi_+] + \mathcal{E}[\phi_-] = \lambda_1.$$

(c) Since $\lambda_1$ minimizes the Rayleigh quotient,

$$\mathcal{E}[\phi_\pm] \ge \lambda_1 \|\phi_\pm\|_2^2.$$

Use this, together with the fact that $\mathcal{E}[\phi_1] = \lambda_1$, to deduce that

$$\mathcal{E}[\phi_\pm] = \lambda_1 \|\phi_\pm\|_2^2.$$

Hence, by the proof of Theorem 11.9, $\phi_\pm \in C^\infty(\Omega)$ and

$$-\Delta\phi_\pm = \lambda_1\phi_\pm. \tag{11.65}$$

(d) Use the strong maximum principle of Theorem 9.5 to deduce from (11.65) that if $\phi_\pm$ has a zero within $\Omega$ then $\phi_\pm \equiv 0$. Conclude that $\phi_1$ has no zeros in $\Omega$.

(e) If $u \in H^1_0(\Omega; \mathbb{R}) \cap C^\infty(\Omega)$ is some other eigenfunction with eigenvalue $\lambda_1$, then $u - c\phi_1$ is also an eigenfunction for each $c \in \mathbb{R}$. Show that $c$ can be chosen so that $u - c\phi_1$ has a zero in $\Omega$. Conclude that $u = c\phi_1$.


11.6 Determine the Euler-Lagrange equations on $\Omega \subset \mathbb{R}^n$ corresponding to the following Lagrangians:


(a)  L(p,  w,x)  = 

(b)  L(p,  w,  x)  = 

(c)  L(p,  w,x)  = 

(d)  L(p,  w,  x)  = 


1  ,2  1 


_  I p\  H — a(pc)w  for  a  G  C^(F2). 

2  2' 

1  n 

-  22  aij(x)pipj  with dij  e  C\Q). 


i,j  =  1 


IP 

1 

2 


1  2 

-  | p |2  +  F(w)  for  F  g  C^R). 


Chapter  12 

Distributions 


To define weak derivatives in Chap. 10, we measured the values of a function $f \in L^1_{\rm loc}(\Omega)$ by integrating against test functions. One way to interpret this process is that $f$ defines a functional $C^\infty_{\rm cpt}(\Omega) \to \mathbb{C}$ given by

$$\psi \mapsto \int_\Omega f\psi\; d^n x.$$

A distribution on $\Omega \subset \mathbb{R}^n$ is a more general functional $C^\infty_{\rm cpt}(\Omega) \to \mathbb{C}$, not necessarily expressible as an integral. To qualify as a distribution, a functional is required to satisfy conditions that ensure that weak derivatives and other basic operations are well defined.

As with weak derivatives, the concept of a distribution was inspired by idealized situations in physics. Indeed, the term "distribution" comes from charge distributions in electrostatics, an example that we will discuss in Sect. 12.1. Distributions generalize the notion of weak solutions, in the sense that every function in $L^1_{\rm loc}(\Omega)$ also defines a distribution. The trade-off for the increased generality is that some basic operations for functions cannot be applied to distributions. The product of two distributions is not generally well defined, for example.

There  are  some  technicalities  in  the  mathematical  theory  of  distributions  that 
require  more  background  on  the  topology  of  function  spaces  than  we  assume  for  this 
text.  We  will  treat  these  technicalities  rather  lightly;  our  focus  will  be  on  exploring 
the  PDE  applications. 


12.1  Model  Problem:  Coulomb’s  Law 

Coulomb's law of electrostatics is an empirical observation developed by the 18th century physicist Charles-Augustin de Coulomb. It says that a particle with electric charge $q_0$, located at the origin, generates an electric field given by

$$E(x) = \frac{k q_0\, x}{r^3}, \tag{12.1}$$

where $r := |x|$ and $k$ (Coulomb's constant) depends on the properties of the medium surrounding the charges.

In Sect. 11.1 we discussed another important empirical law of electrostatics, Gauss's law. With the same convention for physical constants as in (12.1), the differential form of the law says that

$$\nabla \cdot E = 4\pi k\rho, \tag{12.2}$$

where $\rho$ is the charge per unit volume as a function of position.

These two empirical laws present something of a mathematical conundrum, in that the field specified by Coulomb is not differentiable at $x = 0$, not even weakly. On the other hand, for $x \ne 0$,

$$\nabla \cdot \left( \frac{x}{r^3} \right) = \frac{\nabla \cdot x}{r^3} - \frac{3x}{r^4} \cdot \frac{x}{r} = 0. \tag{12.3}$$

This is consistent with (12.2), in that Coulomb assumes the charge density is zero for $x \ne 0$. However, if a function in $L^1_{\rm loc}$ vanishes except at a single point, then that function is zero by the equivalence (7.6). Thus a point charge density has no meaningful interpretation as a locally integrable function.

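To reconcile (12.1) with Gauss's law, let us consider the weak form of (12.2),

$$\int_{\mathbb{R}^3} E \cdot \nabla\psi\; d^3x = -4\pi k \int_{\mathbb{R}^3} \rho\,\psi\; d^3x \tag{12.4}$$

for all $\psi \in C^\infty_{\rm cpt}(\mathbb{R}^3)$. The left side of (12.4) is well defined because the components of $E$ are locally integrable.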

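Since the Coulomb field is smooth away from the origin, we can integrate by parts as long as we exclude the origin from the region of integration by writing the integral as a limit,

$$\int_{\mathbb{R}^3} E \cdot \nabla\psi\; d^3x = \lim_{\varepsilon \to 0} \int_{\{r \ge \varepsilon\}} E \cdot \nabla\psi\; d^3x. \tag{12.5}$$

The region $\{r \ge \varepsilon\}$ has boundary given by the sphere $\{r = \varepsilon\}$. In this case the "outward" unit normal is a radial unit vector pointing towards the origin,

$$\nu = -\frac{x}{r}. \tag{12.6}$$

By the divergence theorem (Theorem 2.6), and the fact that $\nabla \cdot E = 0$ for $r > 0$ by (12.3),

$$\int_{\{r \ge \varepsilon\}} E \cdot \nabla\psi\; d^3x = \int_{\{r = \varepsilon\}} \nu \cdot E\, \psi\; dS.$$

Hence by (12.1) and (12.6),

$$\int_{\{r \ge \varepsilon\}} E \cdot \nabla\psi\; d^3x = -\frac{k q_0}{\varepsilon^2} \int_{\{r = \varepsilon\}} \psi\; dS.$$

Taking $\varepsilon \to 0$ now gives

$$\int_{\mathbb{R}^3} E \cdot \nabla\psi\; d^3x = -\lim_{\varepsilon \to 0} \frac{k q_0}{\varepsilon^2} \int_{\{r = \varepsilon\}} \psi\; dS. \tag{12.7}$$

Because $\psi$ is continuous, the average of $\psi$ over the sphere $\{r = \varepsilon\}$ approaches $\psi(0)$ as $\varepsilon \to 0$, i.e.,

$$\lim_{\varepsilon \to 0} \frac{1}{4\pi\varepsilon^2} \int_{\{r = \varepsilon\}} \psi\; dS = \psi(0).$$

Applying this to (12.7) gives

$$\int_{\mathbb{R}^3} E \cdot \nabla\psi\; d^3x = -4\pi k q_0\, \psi(0). \tag{12.8}$$

The weak condition (12.4) thus requires that

$$\int_{\mathbb{R}^3} \rho\,\psi\; d^3x = q_0\, \psi(0),$$

for every $\psi \in C^\infty_{\rm cpt}(\mathbb{R}^3)$. This is consistent with the physical interpretation of $\rho$ as the density of a charge located exactly at the origin.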
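The limiting step that turns the boundary integral into a point evaluation can be observed numerically. The Python sketch below averages a sample function over spheres of shrinking radius and evaluates the boundary term from the calculation above with $k q_0 = 1$. The function `psi` is a hypothetical smooth, rapidly decaying choice rather than a genuinely compactly supported test function; only its continuity at the origin matters for this step. The angular quadrature is a simple midpoint rule.

```python
import math

def psi(x, y, z):
    """Hypothetical smooth sample function with psi(0, 0, 0) = 1 (not compactly supported)."""
    return (1.0 + x + 2.0 * y * z) * math.exp(-(x * x + y * y + z * z))

def sphere_average(f, eps, n_theta=60, n_phi=120):
    """Average of f over the sphere {r = eps}, midpoint rule in spherical angles."""
    total = 0.0
    weight = 0.0
    for i in range(n_theta):
        theta = math.pi * (i + 0.5) / n_theta
        s = math.sin(theta)                 # area weight sin(theta)
        for j in range(n_phi):
            phi = 2.0 * math.pi * (j + 0.5) / n_phi
            x = eps * s * math.cos(phi)
            y = eps * s * math.sin(phi)
            z = eps * math.cos(theta)
            total += f(x, y, z) * s
            weight += s
    return total / weight

def boundary_term(f, eps):
    """-(1/eps^2) * integral of f over {r = eps}, i.e. -4*pi times the sphere average."""
    return -4.0 * math.pi * sphere_average(f, eps)

for eps in (0.5, 0.1, 0.01):
    print(f"eps={eps:5.2f}  average={sphere_average(psi, eps):.6f}  "
          f"boundary term={boundary_term(psi, eps):.6f}")
print("limit -4*pi*psi(0) =", -4.0 * math.pi * psi(0.0, 0.0, 0.0))
```

As the radius shrinks, the boundary term approaches $-4\pi\psi(0)$, matching (12.8) with $kq_0 = 1$.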

The concept of a "point density" was widely used in physics applications in the 18th and 19th centuries. In a 1930 book on quantum mechanics, the physicist Paul Dirac described such densities in terms of a delta function $\delta(x)$, whose defining property is that

$$\int_{\mathbb{R}^n} f(x)\,\delta(x)\; d^n x := f(0), \tag{12.9}$$

for a continuous function $f$. This terminology and notation are potentially misleading, because $\delta$ is not a function and (12.9) is not actually an integral. However, Dirac's formulation hints at the proper mathematical interpretation, which is that $\delta$ should be understood as a functional $f \mapsto f(0)$.

If we accept the intuitive definition of the delta function for the moment, then we can interpret the calculation (12.8) as showing that

$$\nabla \cdot \frac{x}{r^3} = 4\pi\delta. \tag{12.10}$$

12.2 The Space of Distributions

A distribution on a domain $\Omega \subset \mathbb{R}^n$ is a continuous linear functional $C^\infty_{\rm cpt}(\Omega) \to \mathbb{C}$. The map defined by a distribution $u$ is usually written as a pairing of $u$ with a test function, i.e.,

$$\psi \mapsto (u, \psi) \in \mathbb{C} \tag{12.11}$$

for $\psi \in C^\infty_{\rm cpt}(\Omega)$. Linearity means that

$$(u, c_1\psi_1 + c_2\psi_2) = c_1 (u, \psi_1) + c_2 (u, \psi_2),$$

for all $c_1, c_2 \in \mathbb{C}$ and $\psi_1, \psi_2 \in C^\infty_{\rm cpt}(\Omega)$.

The  definition  of  distribution  also  includes  the  word  “continuous”.  To  define  con¬ 
tinuity  for  functionals  we  must  first  specify  what  convergence  means  in  C™t  (£2) .  The 
standard  definition  is  that  for  a  sequence  {0^}  to  converge  to  0  in  C^{Q)  means  that 
all  0£  have  support  in  some  fixed  compact  set  K  c  £2,  and  the  sequence  of  functions 
and  all  sequences  of  partial  derivatives  converge  uniformly  on  K.  Continuity  of  the 
functional  (12.11)  is  then  defined  by  the  condition  that  convergence  of  a  sequence 
0£  — >  0  in  C^t(£?)  implies  that 

lim  (u,  0fc)  =  (w,  0).  (12.12) 

& — >  oo 

In  finite  dimensions  continuity  is  implied  by  linearity.  That  is  not  the  case  here,  but 
in  practice  it  is  quite  difficult  to  come  up  with  a  functional  that  is  linear  but  not 
continuous. 

The set of distributions on $\Omega$ forms a vector space denoted by $\mathcal{D}'(\Omega)$. Linear combinations of distributions are defined in the obvious way by

$$(c_1u_1 + c_2u_2, \psi) := c_1(u_1, \psi) + c_2(u_2, \psi),$$

for $u_1, u_2 \in \mathcal{D}'(\Omega)$ and $c_1, c_2 \in \mathbb{C}$. The mathematical theory of distributions was developed independently in the mid-20th century by Sergei Sobolev and Laurent Schwartz. Schwartz used $\mathcal{D}$ as a notation for $C^\infty_{\mathrm{cpt}}$, and the prime accent on $\mathcal{D}'$ comes from the notation for the dual of a vector space in linear algebra.

A locally integrable function $f \in L^1_{\mathrm{loc}}(\Omega)$ defines a distribution through the integral pairing

$$(f, \psi) := \int_\Omega f\,\psi \, d^nx. \tag{12.13}$$

Under this convention there is an inclusion

$$L^1_{\mathrm{loc}}(\Omega) \subset \mathcal{D}'(\Omega).$$

In particular, all $L^p$ functions can be interpreted as distributions.

As we saw with the point charge density in Sect. 12.1, not all distributions are given by functions. We use the notation $\delta_x$ for the delta function centered at $x \in \Omega$, defined by

$$(\delta_x, \psi) := \psi(x). \tag{12.14}$$

By convention the subscript is dropped for $x = 0$, i.e., $\delta := \delta_0$.

Multiplication by smooth functions preserves the space $C^\infty_{\mathrm{cpt}}(\Omega)$. Therefore it makes sense to multiply a distribution $u \in \mathcal{D}'(\Omega)$ by a function $f \in C^\infty(\Omega)$. The product distribution is defined by

$$(fu, \psi) := (u, f\psi).$$

It does not make sense, however, to multiply two distributions together. This fact was intuitively clear in early applications: the product of two charge densities makes no physical sense.

Convergence of a sequence of distributions is defined in a very straightforward way. We say that $u_k \to u$ in $\mathcal{D}'(\Omega)$ if

$$\lim_{k \to \infty} (u_k, \psi) = (u, \psi)$$

for all $\psi \in C^\infty_{\mathrm{cpt}}(\Omega)$. All distributions can in fact be approximated by smooth functions by such a limit, although we are not equipped to prove that here. We will present one useful special case, a construction of the delta function as a limit of integrable functions.

Lemma 12.1 Given $f \in L^1(\mathbb{R}^n)$ satisfying

$$\int_{\mathbb{R}^n} f \, d^nx = 1, \tag{12.15}$$

define the rescaled function,

$$f_a(x) := a^n f(ax)$$

for $a > 0$. Then

$$\lim_{a \to \infty} f_a = \delta, \tag{12.16}$$

as a distributional limit.

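Proof For $\psi \in C^\infty_{\mathrm{cpt}}(\mathbb{R}^n)$ we can evaluate the pairing with $f_a$ using a change of variables,

$$(f_a, \psi) = \int_{\mathbb{R}^n} a^n f(ax)\,\psi(x) \, d^nx = \int_{\mathbb{R}^n} f(x)\,\psi(x/a) \, d^nx.$$

By the assumption (12.15), we can also write

$$\psi(0) = \int_{\mathbb{R}^n} f(x)\,\psi(0) \, d^nx,$$

which gives the estimate

$$\big|(f_a, \psi) - \psi(0)\big| \le \int_{\mathbb{R}^n} |f(x)|\,\big|\psi(x/a) - \psi(0)\big| \, d^nx. \tag{12.17}$$

Given $\varepsilon > 0$, the fact that $f$ is integrable implies that there exists $R$ sufficiently large so that

$$\int_{|x| \ge R} |f| \, d^nx < \varepsilon. \tag{12.18}$$

By the continuity of $\psi$ we can also choose $\delta > 0$ so that

$$|\psi(x) - \psi(0)| < \varepsilon$$

for $|x| < \delta$. For $a > R/\delta$ this implies that

$$\big|\psi(x/a) - \psi(0)\big| < \varepsilon \tag{12.19}$$

for all $|x| < R$. Using (12.18) and (12.19) to estimate the difference (12.17) gives

$$\big|(f_a, \psi) - \psi(0)\big| \le 2\|\psi\|_\infty \int_{|x| \ge R} |f(x)| \, d^nx + \varepsilon \int_{|x| < R} |f(x)| \, d^nx \le \big(2\|\psi\|_\infty + \|f\|_1\big)\varepsilon,$$

for $a > R/\delta$. Since $\varepsilon$ was arbitrary, this shows that

$$\lim_{a \to \infty} (f_a, \psi) = \psi(0).$$

□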
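The convergence in Lemma 12.1 can be observed numerically. The following Python sketch is an illustration only, under assumptions that are not part of the text: it works in dimension $n = 1$, takes $f$ to be the normalized Gaussian $\pi^{-1/2} e^{-x^2}$, uses the standard bump function as the test function $\psi$, and approximates the pairing $(f_a, \psi)$ by a midpoint rule. The names `f`, `psi`, and `pairing` are hypothetical helpers.

```python
import math

def f(x):
    # Assumed example: normalized Gaussian, so that the integral of f is 1 as in (12.15).
    return math.exp(-x * x) / math.sqrt(math.pi)

def psi(x):
    # Assumed test function: smooth bump supported in (-1, 1).
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def pairing(a, n=20000):
    # Midpoint-rule approximation of (f_a, psi) = integral of a*f(a*x)*psi(x) over [-1, 1].
    step = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * step
        total += a * f(a * x) * psi(x)
    return total * step

# Distance between the pairing and psi(0) for increasing a.
errors = [abs(pairing(a) - psi(0.0)) for a in (1.0, 4.0, 16.0, 64.0)]
for a, err in zip((1.0, 4.0, 16.0, 64.0), errors):
    print(a, err)
```

In this sketch the gap between $(f_a, \psi)$ and $\psi(0)$ shrinks as $a$ grows, which is the behavior the lemma predicts.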

The  rescaling  used  in  Lemma  12.1  is  illustrated  in  Fig.  12.1.  Note  that  this  looks 
very  similar  to  Fig.  9.1,  and  in  fact  the  proof  of  Lemma  12.1  uses  essentially  the 
same  argument  as  that  of  Theorem  9.1.  We  saw  another  case  of  this  construction  in 
the  proof  of  Theorem  6.2.  Indeed,  we  can  now  interpret  the  result  of  Theorem  6.2  as 
a  distributional  limit  of  the  heat  kernel, 


$$\lim_{t \to 0} H_t = \delta,$$

where $H_t$ was defined in (6.16).

Fig. 12.1 Rescaled functions $f_a$

12.3  Distributional  Derivatives 

The  distributional  derivative  extends  the  concept  of  the  weak  derivative  introduced 
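in Sect. 10.1. By analogy with (10.7), for $u \in \mathcal{D}'(\Omega)$ and a multi-index $\alpha$ we define the distribution $D^\alpha u$ by

$$(D^\alpha u, \psi) := (-1)^{|\alpha|} (u, D^\alpha \psi), \tag{12.20}$$

with

$$D^\alpha := \frac{\partial^{\alpha_1}}{\partial x_1^{\alpha_1}} \cdots \frac{\partial^{\alpha_n}}{\partial x_n^{\alpha_n}}$$

and $|\alpha| := \alpha_1 + \cdots + \alpha_n$, as before. The pairing (12.20) is well defined as a distribution because $D^\alpha$ is both linear and continuous as a map $C^\infty_{\mathrm{cpt}}(\Omega) \to C^\infty_{\mathrm{cpt}}(\Omega)$.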

The  terms  “distributional”  and  “weak”  are  frequently  used  interchangeably  to 
describe  derivatives,  since  the  definitions  overlap  to  a  considerable  extent.  The  only 
difference  is  that  a  weak  derivative  is  representable  as  a  locally  integrable  function. 
Weak  derivatives  may  not  exist,  whereas  all  distributions  are  infinitely  differentiable. 

Example 12.2 Let us reconsider Example 10.3, where we considered the derivative of $w \in L^1_{\mathrm{loc}}(\mathbb{R})$ defined by

$$w(t) = \begin{cases} w_-(t), & t < 0, \\ w_+(t), & t > 0, \end{cases}$$

where $w_\pm \in C^1(\mathbb{R})$. As part of that calculation we showed that

$$-\int_{-\infty}^{\infty} w\,\psi' \, dt = [w_+(0) - w_-(0)]\,\psi(0) + \int_{-\infty}^{\infty} h\,\psi \, dt, \tag{12.21}$$

where $h$ is the piecewise derivative

$$h(t) = \begin{cases} w_-'(t), & t < 0, \\ w_+'(t), & t > 0. \end{cases}$$

The left-hand side of (12.21) is the pairing $(w', \psi)$ by the definition (12.20). From the right-hand side we can thus see that the distributional derivative is

$$w' = h + [w_+(0) - w_-(0)]\,\delta.$$

◊
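The identity (12.21) lends itself to a quick numerical sanity check. The sketch below is only an illustration under assumed choices that do not come from the text: $w_-(t) = e^t$, $w_+(t) = 3 - t$ (so the jump at $0$ is $2$), a bump test function centered at an arbitrary point `C`, and a midpoint rule split at $t = 0$. All function names are hypothetical.

```python
import math

C = 0.2  # assumed center of the test function; support is (C - 1, C + 1)

def psi(t):
    u = t - C
    if abs(u) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - u * u))

def dpsi(t):
    # Derivative of the bump: psi * d/du[-1/(1 - u^2)].
    u = t - C
    if abs(u) >= 1.0:
        return 0.0
    return psi(t) * (-2.0 * u) / (1.0 - u * u) ** 2

def w(t):
    # Assumed piecewise function: w_-(t) = exp(t), w_+(t) = 3 - t.
    return math.exp(t) if t < 0 else 3.0 - t

def h(t):
    # Piecewise derivative of w away from t = 0.
    return math.exp(t) if t < 0 else -1.0

def midpoint(func, a, b, n=20000):
    step = (b - a) / n
    return sum(func(a + (i + 0.5) * step) for i in range(n)) * step

def split_integral(func):
    # Integrate over the support of psi, split at the jump t = 0.
    return midpoint(func, C - 1.0, 0.0) + midpoint(func, 0.0, C + 1.0)

lhs = -split_integral(lambda t: w(t) * dpsi(t))   # pairing (w', psi) = -(w, psi')
jump = 3.0 - 1.0                                  # w_+(0) - w_-(0)
rhs = jump * psi(0.0) + split_integral(lambda t: h(t) * psi(t))
print(lhs, rhs)
```

The two printed values should agree to within the quadrature error, reflecting the delta term $[w_+(0) - w_-(0)]\,\delta$ in the distributional derivative.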

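Example 12.3 For $\delta_x \in \mathcal{D}'(\mathbb{R}^n)$, the derivatives $D^\alpha \delta_x$ are easily computed from the definition (12.20). For $\psi \in C^\infty_{\mathrm{cpt}}(\mathbb{R}^n)$,

$$(D^\alpha \delta_x, \psi) = (-1)^{|\alpha|} D^\alpha \psi(x).$$

In other words, the distribution $D^\alpha \delta_x$ evaluates the derivative of the test function at the point $x$, up to a sign. ◊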


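Example 12.4 The function $\ln|x|$ is locally integrable on $\mathbb{R}$ and so defines a distribution in $\mathcal{D}'(\mathbb{R})$. Therefore $(\ln|x|)'$ exists in the distribution sense. This is puzzling because

$$\frac{d}{dx} \ln|x| = \frac{1}{x}$$

for $x \ne 0$, and $x^{-1}$ is not locally integrable.

To understand what is happening here, we must return to the distributional definition,

$$\big((\ln|x|)', \psi\big) := -(\ln|x|, \psi') = -\int_{-\infty}^{\infty} \psi'(x) \ln|x| \, dx$$

for $\psi \in C^\infty_{\mathrm{cpt}}(\mathbb{R})$. To compute this we avoid the singularity at $0$ by writing

$$\big((\ln|x|)', \psi\big) = -\lim_{\varepsilon \to 0} \int_{|x| \ge \varepsilon} \psi'(x) \ln|x| \, dx. \tag{12.22}$$

Integration by parts gives

$$-\int_{-\infty}^{-\varepsilon} \psi'(x) \ln|x| \, dx = -\psi(x) \ln|x| \,\Big|_{-\infty}^{-\varepsilon} + \int_{-\infty}^{-\varepsilon} \frac{\psi(x)}{x} \, dx = -\psi(-\varepsilon) \ln\varepsilon + \int_{-\infty}^{-\varepsilon} \frac{\psi(x)}{x} \, dx,$$

and similarly

$$-\int_{\varepsilon}^{\infty} \psi'(x) \ln|x| \, dx = \psi(\varepsilon) \ln\varepsilon + \int_{\varepsilon}^{\infty} \frac{\psi(x)}{x} \, dx.$$

After combining these two halves, we obtain

$$-\int_{|x| \ge \varepsilon} \psi'(x) \ln|x| \, dx = [\psi(\varepsilon) - \psi(-\varepsilon)] \ln\varepsilon + \int_{|x| \ge \varepsilon} \frac{\psi(x)}{x} \, dx.$$

By the definition of the derivative,

$$\lim_{\varepsilon \to 0} \frac{\psi(\varepsilon) - \psi(-\varepsilon)}{2\varepsilon} = \psi'(0).$$

Therefore

$$\lim_{\varepsilon \to 0} [\psi(\varepsilon) - \psi(-\varepsilon)] \ln\varepsilon = 2\psi'(0) \lim_{\varepsilon \to 0} \varepsilon \ln\varepsilon = 0.$$

Hence (12.22) reduces to

$$\big((\ln|x|)', \psi\big) = \lim_{\varepsilon \to 0} \int_{|x| \ge \varepsilon} \frac{\psi(x)}{x} \, dx. \tag{12.23}$$

The limit on the right exists for $\psi \in C^\infty_{\mathrm{cpt}}(\mathbb{R})$, even though $x^{-1}$ is not integrable, because the limit is taken symmetrically. This limiting procedure defines a distribution called the principal value of $x^{-1}$, written as $\mathrm{PV}[x^{-1}]$. We could rephrase (12.23) as

$$\frac{d}{dx} \ln|x| = \mathrm{PV}\big[x^{-1}\big].$$

◊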
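As a numerical cross-check of (12.23), one can compare $-\int \psi'(x) \ln|x| \, dx$ with the principal value integral for a concrete test function. The sketch below is illustrative and rests on assumptions not in the text: the test function is a bump centered at an arbitrary point `C` (chosen off-center so that neither side vanishes by symmetry), the principal value is rewritten as $\int_0^\infty [\psi(x) - \psi(-x)]/x \, dx$, and both integrals are approximated by a midpoint rule. The helper names are hypothetical.

```python
import math

C = 0.3  # assumed center of the bump; its support is (C - 1, C + 1) = (-0.7, 1.3)

def psi(x):
    u = x - C
    if abs(u) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - u * u))

def dpsi(x):
    u = x - C
    if abs(u) >= 1.0:
        return 0.0
    return psi(x) * (-2.0 * u) / (1.0 - u * u) ** 2

def midpoint(func, a, b, n=100000):
    # Midpoints never land exactly on the endpoints, so ln|x| is never evaluated at 0.
    step = (b - a) / n
    return sum(func(a + (i + 0.5) * step) for i in range(n)) * step

# Left side of (12.23): -(ln|x|, psi'), split at the logarithmic singularity.
log_side = -(midpoint(lambda x: dpsi(x) * math.log(abs(x)), C - 1.0, 0.0)
             + midpoint(lambda x: dpsi(x) * math.log(abs(x)), 0.0, C + 1.0))

# Right side: the symmetric limit equals the integral of (psi(x) - psi(-x))/x over x > 0.
pv_side = midpoint(lambda x: (psi(x) - psi(-x)) / x, 0.0, C + 1.0)

print(log_side, pv_side)
```

The symmetric rewriting of the principal value removes the singularity from the integrand, which is why a plain midpoint rule suffices on that side.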

Example 12.5 Let us reinterpret the discussion from Sect. 12.1 in terms of distributional derivatives. We already noted that the components of $x/r^3$ are locally integrable, so we can consider the Coulomb formula (12.1) for $E$ as the definition of a vector-valued distribution. The distributional divergence of $x/r^3$ is defined by the condition that

$$\Big(\nabla \cdot \frac{x}{r^3}, \psi\Big) := -\int_{\mathbb{R}^3} \frac{x}{r^3} \cdot \nabla\psi \, d^3x,$$

for $\psi \in C^\infty_{\mathrm{cpt}}(\mathbb{R}^3)$. The derivation of (12.8) thus shows that

$$\nabla \cdot \frac{x}{r^3} = 4\pi\delta. \tag{12.24}$$

We can also consider the corresponding result for the Coulomb electric potential

$$\phi(x) = \frac{1}{r}$$

(ignoring the physical constants). The gradient of $\phi$ exists in the weak sense and is given by

$$\nabla\phi = -\frac{x}{r^3}.$$

Since $\Delta = \nabla \cdot \nabla$, we deduce from (12.24) that

$$-\Delta\Big(\frac{1}{4\pi r}\Big) = \delta. \tag{12.25}$$

◊


12.4  Fundamental  Solutions 


Because  the  Poisson  equation  is  linear,  it  makes  sense  to  construct  a  solution  with 
a  continuous  density  by  superimposing  a  field  of  point  sources.  With  a  change  of 
variables,  we  can  see  from  (12.25)  that  the  potential  function  corresponding  to  a 
point source at $y \in \mathbb{R}^3$ is

$$\frac{1}{4\pi|x - y|}.$$

Weighting the point sources by the density $\rho$ and summing them with an integral gives

$$u(x) = \frac{1}{4\pi} \int_{\mathbb{R}^3} \frac{\rho(y)}{|x - y|} \, d^3y. \tag{12.26}$$

This formula, which is often stated as the integral form of Coulomb's law, does indeed yield a solution of the Poisson equation on $\mathbb{R}^3$ under certain conditions. For example if $\rho \in C^1_{\mathrm{cpt}}(\mathbb{R}^3)$ then one can confirm that $-\Delta u = \rho$ by direct computation.

The $C^1$ condition is stronger than necessary here, but continuity alone would not be sufficient. (The precise notion of regularity needed for this problem is something called Hölder continuity.)

This idea of constructing general solutions by superposition of point sources is the inspiration for the concept of a fundamental solution. For a constant-coefficient differential operator $L$ acting on $\mathbb{R}^n$, of the form

$$L = \sum_{|\alpha| \le m} a_\alpha D^\alpha$$

with $a_\alpha \in \mathbb{C}$, a fundamental solution is a distribution $\Phi \in \mathcal{D}'(\mathbb{R}^n)$ such that

$$L\Phi = \delta. \tag{12.27}$$

For example, in the Coulomb case the calculation (12.25) gives the fundamental solution of $-\Delta$ on $\mathbb{R}^3$. Fundamental solutions are especially important for classical problems involving the Laplacian.

The solution formula (12.26) resembles the convolution used to solve the heat equation in Sect. 6.3. For $f, g \in L^1(\mathbb{R}^n)$ the convolution is defined as

$$f * g(x) := \int_{\mathbb{R}^n} f(y)\,g(x - y) \, d^ny.$$

A simple change of variables shows that this product is symmetric,

$$f * g = g * f.$$

In order to produce solution formulas from fundamental solutions, we need to understand how to take convolutions with distributions.

For $f, g \in L^1(\mathbb{R}^n)$, the distributional pairing of $f * g$ with $\psi \in C^\infty_{\mathrm{cpt}}(\mathbb{R}^n)$ gives

$$(f * g, \psi) = \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} f(y)\,g(x - y)\,\psi(x) \, d^ny \, d^nx. \tag{12.28}$$

The $x$ integration looks almost like the convolution of $\psi$ with $g$, except with the argument switched from $y - x$ to $x - y$. With the reflection defined by

$$g^-(x) := g(-x),$$

we have

$$g^- * \psi(y) = \int_{\mathbb{R}^n} g(x - y)\,\psi(x) \, d^nx.$$

Thus (12.28) reduces to

$$(f * g, \psi) = (f, g^- * \psi).$$

If $\phi, \psi \in C^\infty_{\mathrm{cpt}}(\mathbb{R}^n)$, then it is easy to check that $\phi^- * \psi \in C^\infty_{\mathrm{cpt}}(\mathbb{R}^n)$ also. Moreover, the map $\psi \mapsto \phi^- * \psi$ is linear and continuous. We can thus define $u * \phi$ for $u \in \mathcal{D}'(\mathbb{R}^n)$ and $\phi \in C^\infty_{\mathrm{cpt}}(\mathbb{R}^n)$ by

$$(u * \phi, \psi) := (u, \phi^- * \psi) \tag{12.29}$$

for $\psi \in C^\infty_{\mathrm{cpt}}(\mathbb{R}^n)$.

The distribution $\delta$ plays a special role with regard to convolutions. By the definition (12.29),

$$(\delta * \phi, \psi) = (\delta, \phi^- * \psi) = \phi^- * \psi(0) = \int_{\mathbb{R}^n} \phi(x)\,\psi(x) \, d^nx = (\phi, \psi).$$

This shows that

$$\delta * \phi = \phi. \tag{12.30}$$

In other words, convolution by $\delta$ is the identity map.

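Let $\Phi$ be the fundamental solution for the constant coefficient operator $L$. Our goal is to show that the equation $Lu = f$ is solved by the convolution $u = \Phi * f$, at least for $f \in C^\infty_{\mathrm{cpt}}(\mathbb{R}^n)$. To check this, we need to know how to evaluate derivatives of the convolution.

Lemma 12.6 For $u \in \mathcal{D}'(\mathbb{R}^n)$ and $\phi \in C^\infty_{\mathrm{cpt}}(\mathbb{R}^n)$,

$$D^\alpha(u * \phi) = (D^\alpha u) * \phi = u * (D^\alpha \phi).$$

Proof For $\phi, \psi \in C^\infty_{\mathrm{cpt}}(\mathbb{R}^n)$, we compute directly that

$$D^\alpha(\phi * \psi)(x) = D^\alpha \int_{\mathbb{R}^n} \phi(x - y)\,\psi(y) \, d^ny = \int_{\mathbb{R}^n} D^\alpha\phi(x - y)\,\psi(y) \, d^ny = (D^\alpha\phi) * \psi(x).$$

Since the convolution is symmetric, the same formula holds with $\phi$ and $\psi$ switched. Thus the formula

$$D^\alpha(\phi * \psi) = (D^\alpha\phi) * \psi = \phi * (D^\alpha\psi) \tag{12.31}$$

holds for test functions.

For $u \in \mathcal{D}'(\mathbb{R}^n)$ and $\phi \in C^\infty_{\mathrm{cpt}}(\mathbb{R}^n)$, it follows from the definitions that

$$\big(D^\alpha(u * \phi), \psi\big) = (-1)^{|\alpha|} \big(u, \phi^- * (D^\alpha\psi)\big).$$

By (12.31) this gives

$$\big(D^\alpha(u * \phi), \psi\big) = (-1)^{|\alpha|} \big(u, D^\alpha(\phi^- * \psi)\big) = (D^\alpha u, \phi^- * \psi) = \big((D^\alpha u) * \phi, \psi\big),$$

and also, because $D^\alpha(\phi^-) = (-1)^{|\alpha|} (D^\alpha\phi)^-$,

$$\big(D^\alpha(u * \phi), \psi\big) = (-1)^{|\alpha|} \big(u, (D^\alpha\phi^-) * \psi\big) = \big(u, (D^\alpha\phi)^- * \psi\big) = \big(u * (D^\alpha\phi), \psi\big).$$

□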

Theorem 12.7 If $L$ is a constant coefficient operator on $\mathbb{R}^n$ with fundamental solution $\Phi$, then for $f \in C^\infty_{\mathrm{cpt}}(\mathbb{R}^n)$ the equation

$$Lu = f$$

is solved by

$$u = \Phi * f.$$

Proof By Lemma 12.6,

$$L(\Phi * f) = \sum_{|\alpha| \le m} a_\alpha D^\alpha(\Phi * f) = \sum_{|\alpha| \le m} a_\alpha (D^\alpha\Phi) * f = (L\Phi) * f.$$

Note that the second step only works because the coefficients $a_\alpha$ are assumed to be constant. Since $L\Phi = \delta$, we see from (12.30) that

$$L(\Phi * f) = f.$$

□


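A result called the Malgrange-Ehrenpreis theorem, proven in the 1950s, says that every constant coefficient differential operator on $\mathbb{R}^n$ admits a fundamental solution. The fundamental solution of the Laplacian, which we will now work out for any dimension, is the most important case.

Theorem 12.8 On $\mathbb{R}^n$ the operator $-\Delta$ has the fundamental solution

$$\Phi(x) = \begin{cases} -\dfrac{1}{2\pi} \ln r, & n = 2, \\[2mm] \dfrac{1}{(n-2)A_n r^{n-2}}, & n \ge 3, \end{cases} \tag{12.32}$$

where $A_n$ denotes the volume of the unit sphere $S^{n-1} \subset \mathbb{R}^n$.

Proof We start from the distributional derivative,

$$(-\Delta\Phi, \psi) = -(\Phi, \Delta\psi)$$

for $\psi \in C^\infty_{\mathrm{cpt}}(\mathbb{R}^n)$. To evaluate this, it is useful to first compute the gradient,

$$\nabla\Phi(x) = -\frac{x}{A_n r^n}.$$

The function $x/r^n$ is locally integrable in $\mathbb{R}^n$ and $\psi$ has compact support. Therefore we can deduce from Green's first identity (Theorem 2.10) that

$$\int_{\mathbb{R}^n} \Phi\,\Delta\psi \, d^nx = -\int_{\mathbb{R}^n} \nabla\Phi \cdot \nabla\psi \, d^nx = \frac{1}{A_n} \int_{\mathbb{R}^n} \frac{x}{r^n} \cdot \nabla\psi \, d^nx = \frac{1}{A_n} \int_{\mathbb{R}^n} \frac{1}{r^{n-1}} \frac{\partial\psi}{\partial r} \, d^nx.$$

The integral can be evaluated using radial coordinates as in (2.10):

$$\int_{\mathbb{R}^n} \Phi\,\Delta\psi \, d^nx = \frac{1}{A_n} \int_{S^{n-1}} \int_0^\infty \frac{\partial\psi}{\partial r} \, dr \, dS = -\frac{1}{A_n} \int_{S^{n-1}} \psi(0) \, dS = -\psi(0).$$

This shows that

$$(-\Delta\Phi, \psi) = \psi(0),$$

hence $-\Delta\Phi = \delta$.

□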


12.5  Green’s  Functions 

Although fundamental solutions are defined only for the domain $\mathbb{R}^n$, one of their principal applications is to boundary value problems on a bounded domain $\Omega \subset \mathbb{R}^n$. The connection comes from an integral formula introduced in 1828 by George Green.

For this section, let $\Phi$ denote the fundamental solution of the Laplacian on $\mathbb{R}^n$, as given by (12.32). For $y \in \mathbb{R}^n$ we set

$$\Phi_y(x) := \Phi(x - y). \tag{12.33}$$

Theorem 12.9 (Green's representation formula) Suppose that $\Omega \subset \mathbb{R}^n$ is a bounded domain with piecewise $C^1$ boundary. For $u \in C^2(\overline{\Omega})$,

$$u(y) = -\int_\Omega \Phi_y\,\Delta u \, d^nx + \int_{\partial\Omega} \Big(\Phi_y \frac{\partial u}{\partial\nu} - u \frac{\partial\Phi_y}{\partial\nu}\Big) \, dS$$

for $y \in \Omega$.

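Proof Because the point $y \in \Omega$ is fixed, for notational convenience we can change variables to assume $y = 0$. For $\varepsilon > 0$ set

$$B_\varepsilon := B(0; \varepsilon),$$

and assume that $\varepsilon$ is small enough that $\overline{B_\varepsilon} \subset \Omega$.

On $\Omega - B_\varepsilon$, $\Phi$ is smooth and satisfies $\Delta\Phi = 0$. Therefore, applying Green's second identity (Theorem 2.11) on this domain with $v = \Phi$ gives

$$\int_{\Omega - B_\varepsilon} \Phi\,\Delta u \, d^nx = \int_{\partial\Omega} \Big(\Phi \frac{\partial u}{\partial\nu} - u \frac{\partial\Phi}{\partial\nu}\Big) \, dS - \int_{\partial B_\varepsilon} \Big(\Phi \frac{\partial u}{\partial r} - u \frac{\partial\Phi}{\partial r}\Big) \, dS. \tag{12.34}$$

Because $\Delta u$ is continuous and $\Phi$ is locally integrable,

$$\lim_{\varepsilon \to 0} \int_{\Omega - B_\varepsilon} \Phi\,\Delta u \, d^nx = \int_\Omega \Phi\,\Delta u \, d^nx.$$

To prove the representation formula we must therefore show that

$$\lim_{\varepsilon \to 0} \int_{\partial B_\varepsilon} \Big(\Phi \frac{\partial u}{\partial r} - u \frac{\partial\Phi}{\partial r}\Big) \, dS = u(0). \tag{12.35}$$

To handle the first term in (12.35), note that

$$\Big|\frac{\partial u}{\partial r}\Big| \le |\nabla u|,$$

for $r > 0$. Therefore, since $\nabla u$ is continuous by assumption, we have a bound

$$\max_{\partial B_\varepsilon} \Big|\frac{\partial u}{\partial r}\Big| \le M$$

for $\varepsilon > 0$, with $M$ independent of $\varepsilon$. Using the fact that $\mathrm{vol}(\partial B_\varepsilon) = A_n \varepsilon^{n-1}$ and the formula (12.32) for $\Phi$, we can estimate

$$\Big|\int_{\partial B_\varepsilon} \Phi \frac{\partial u}{\partial r} \, dS\Big| \le M \begin{cases} \varepsilon\,|\ln\varepsilon|, & n = 2, \\[1mm] \dfrac{\varepsilon}{n-2}, & n \ge 3. \end{cases}$$

This shows that

$$\lim_{\varepsilon \to 0} \int_{\partial B_\varepsilon} \Phi \frac{\partial u}{\partial r} \, dS = 0.$$

For the second term in (12.35), we use the fact that

$$\frac{\partial\Phi}{\partial r} = -\frac{1}{A_n r^{n-1}},$$

for $r > 0$, to compute

$$-\int_{\partial B_\varepsilon} u \frac{\partial\Phi}{\partial r} \, dS = \frac{1}{A_n \varepsilon^{n-1}} \int_{\partial B_\varepsilon} u \, dS.$$

The right-hand side is the average value of $u$ over the sphere $\partial B_\varepsilon$. By continuity,

$$\lim_{\varepsilon \to 0} \frac{1}{\mathrm{vol}(\partial B_\varepsilon)} \int_{\partial B_\varepsilon} u \, dS = u(0).$$

This proves (12.35), and thus establishes the representation formula. □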

The  representation  formula  of  Theorem  12.9  has  many  applications.  The  original 
goal  that  Green  had  in  mind  was  a  solution  formula  for  the  Poisson  problem  with 
inhomogeneous  Dirichlet  boundary  conditions,  which  we  will  now  describe. 
Suppose there exists a family of functions $H_y \in C^2(\overline{\Omega})$, for $y \in \Omega$, satisfying

$$\Delta H_y = 0, \qquad H_y\big|_{\partial\Omega} = \Phi_y\big|_{\partial\Omega}. \tag{12.36}$$

Then the Green's function of $\Omega$ is

$$G_y := \Phi_y - H_y. \tag{12.37}$$

It is possible to show that $H_y$ exists under general regularity conditions on $\partial\Omega$, but this is too technical for us to get into here. We will focus on cases where $H_y$ can be computed explicitly, which requires the geometry of $\Omega$ to be very simple.

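Theorem 12.10 Suppose $\Omega \subset \mathbb{R}^n$ is a bounded domain with piecewise $C^1$ boundary that admits a Green's function $G_y$. Then the Poisson problem on $\Omega$,

$$-\Delta u = f, \qquad u\big|_{\partial\Omega} = g,$$

for $f \in C^0(\overline{\Omega})$, $g \in C^0(\partial\Omega)$, is solved by the function

$$u(y) = \int_\Omega f\,G_y \, d^nx - \int_{\partial\Omega} g \frac{\partial G_y}{\partial\nu} \, dS.$$

Proof Setting $v = H_y$ in Green's second identity (Theorem 2.11) gives

$$\int_\Omega \big(H_y\,\Delta u - u\,\Delta H_y\big) \, d^nx = \int_{\partial\Omega} \Big(H_y \frac{\partial u}{\partial\nu} - u \frac{\partial H_y}{\partial\nu}\Big) \, dS. \tag{12.38}$$

The result then follows by subtracting (12.38) from the representation formula of Theorem 12.9, using $\Delta H_y = 0$, $-\Delta u = f$, and the fact that $G_y$ vanishes on $\partial\Omega$. □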

Example 12.11 The Green's function for the unit disk $\mathbb{D} \subset \mathbb{R}^2$ can be derived using a trick from electrostatics called the method of images. This involves placing charges outside the domain in order to solve the boundary value problem. For the unit disk, in order to find $H_y$ we consider a charge placed at the point $\tilde{y}$ given by "reflecting" $y \in \mathbb{D} \setminus \{0\}$ across the unit circle, i.e.,

$$\tilde{y} := \frac{y}{|y|^2}. \tag{12.39}$$

Note that $\Phi_{\tilde{y}}$ is harmonic on $\mathbb{D}$ because $\tilde{y} \notin \mathbb{D}$.

For $x \in \partial\mathbb{D}$, let $\varphi$ denote the angle from $y$ to $x$, as shown in Fig. 12.2. By the law of cosines on the triangle made by $0$, $y$ and $x$,

$$|x - y|^2 = 1 + |y|^2 - 2|y| \cos\varphi.$$

If $y$ is replaced by $\tilde{y}$, the corresponding formula is

$$|x - \tilde{y}|^2 = 1 + |\tilde{y}|^2 - 2|\tilde{y}| \cos\varphi = 1 + |y|^{-2} - 2|y|^{-1} \cos\varphi.$$

Solving for $\cos\varphi$ in these expressions gives the relation

$$\frac{|x - y|^2 - 1 - |y|^2}{2|y|} = \frac{|x - \tilde{y}|^2 - 1 - |y|^{-2}}{2|y|^{-1}},$$

which simplifies to

$$|x - \tilde{y}| = \frac{|x - y|}{|y|} \tag{12.40}$$

for $x \in \partial\mathbb{D}$.


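Fig. 12.2 Geometry for the method of images on the unit disk

Since $\Phi(x) = -\frac{1}{2\pi} \ln|x|$ in $\mathbb{R}^2$, taking the logarithm of (12.40) gives

$$\Phi_{\tilde{y}}(x) = \Phi_y(x) + \frac{1}{2\pi} \ln|y|$$

for $x \in \partial\mathbb{D}$ and $y \ne 0$. Thus we can solve (12.36) for $y \ne 0$ by setting

$$H_y := \Phi_{\tilde{y}} - \frac{1}{2\pi} \ln|y|.$$

For $y = 0$ the obvious solution is $H_0 := 0$ because $\Phi|_{\partial\mathbb{D}} = 0$. The Green's function is thus

$$G_y(x) = \begin{cases} -\dfrac{1}{2\pi} \ln\Big(\dfrac{|x - y|}{|x - \tilde{y}|\,|y|}\Big), & y \ne 0, \\[2mm] -\dfrac{1}{2\pi} \ln|x|, & y = 0. \end{cases}$$

To apply this in the solution formula, we need the radial derivative of $G$. For $y$ fixed and $r := |x|$ we compute, for $x \in \partial\mathbb{D}$,

$$\frac{\partial}{\partial r} \ln|x - y| = x \cdot \nabla \ln|x - y| = x \cdot \frac{x - y}{|x - y|^2} = \frac{1 - x \cdot y}{|x - y|^2}.$$

Applying the corresponding result for $|x - \tilde{y}|$ and subtracting gives

$$\frac{\partial G_y}{\partial r} = -\frac{1}{2\pi} \Big(\frac{1 - x \cdot y}{|x - y|^2} - \frac{1 - x \cdot \tilde{y}}{|x - \tilde{y}|^2}\Big),$$

which by (12.40) simplifies to

$$\frac{\partial G_y}{\partial r} = -\frac{1}{2\pi} \frac{1 - |y|^2}{|x - y|^2}. \tag{12.41}$$

Conveniently, the calculation for $y = 0$ leads to the same expression.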

In the case of the Laplace equation on $\mathbb{D}$, Theorem 12.10 gives the formula for a harmonic function $u$ with boundary value $g$ as

$$u(y) = \frac{1}{2\pi} \int_{\partial\mathbb{D}} \frac{1 - |y|^2}{|x - y|^2}\, g(x) \, dS(x).$$

This will look more familiar in polar coordinates. With $y = (r\cos\theta, r\sin\theta)$ and $x = (\cos\eta, \sin\eta)$, the formula becomes

$$u(r, \theta) = \frac{1}{2\pi} \int_0^{2\pi} \frac{1 - r^2}{1 + r^2 - 2r\cos(\eta - \theta)}\, g(\cos\eta, \sin\eta) \, d\eta.$$

This is the classical Poisson formula (9.4) that we derived from Fourier series. ◊
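The Poisson formula is easy to test numerically on a boundary function whose harmonic extension is known. The Python sketch below is an illustration under assumptions not taken from the text: the boundary data is $g(x_1, x_2) = x_1^2 - x_2^2$, whose harmonic extension is $r^2\cos 2\theta$, and the integral over $\eta$ is approximated by an equally spaced sum (the trapezoid rule for a periodic integrand). The function name `poisson_disk` is hypothetical.

```python
import math

def poisson_disk(g, r, theta, n=400):
    # Equally spaced quadrature of the Poisson integral over eta in [0, 2*pi).
    total = 0.0
    for k in range(n):
        eta = 2.0 * math.pi * k / n
        kernel = (1.0 - r * r) / (1.0 + r * r - 2.0 * r * math.cos(eta - theta))
        total += kernel * g(math.cos(eta), math.sin(eta))
    # The factor 1/(2*pi) times the step 2*pi/n reduces to 1/n.
    return total / n

def g(x1, x2):
    # Assumed boundary data: restriction of the harmonic polynomial x1^2 - x2^2.
    return x1 * x1 - x2 * x2

r, theta = 0.5, 0.7
u_numeric = poisson_disk(g, r, theta)
u_exact = r * r * math.cos(2.0 * theta)
print(u_numeric, u_exact)
```

For interior points the integrand is smooth and periodic, so even a modest number of quadrature nodes reproduces the exact value to high accuracy in this example.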


12.6  Time-Dependent  Fundamental  Solutions 

To  adapt  the  concept  of  a  fundamental  solution  to  evolution  equations,  we  need 
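to consider time-dependent distributions on $\mathbb{R}^n$. We will use a subscript to denote the time dependence, to avoid confusion with the spatial variables. Thus a map $\mathbb{R} \to \mathcal{D}'(\mathbb{R}^n)$ will be written

$$t \mapsto w_t.$$

For $\psi \in C^\infty_{\mathrm{cpt}}(\mathbb{R}^n)$ the pairing $(w_t, \psi)$ is a complex-valued function of $t$.

The function $t \mapsto w_t$ is differentiable with respect to time if there exists a family of distributions $\frac{\partial w_t}{\partial t} \in \mathcal{D}'(\mathbb{R}^n)$ such that

$$\frac{d}{dt}(w_t, \psi) = \Big(\frac{\partial w_t}{\partial t}, \psi\Big) \tag{12.42}$$

for all $\psi \in C^\infty_{\mathrm{cpt}}(\mathbb{R}^n)$. Higher derivatives are defined in the same way.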


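Example 12.12 In $\mathbb{R}$, consider the derivatives of $\delta_t$, the delta function supported at $t$. By definition,

$$(\delta_t, \psi) := \psi(t),$$

so that

$$\Big(\frac{\partial^n}{\partial t^n} \delta_t, \psi\Big) = \psi^{(n)}(t).$$

Compare this to the spatial derivatives, defined according to (12.20),

$$\Big(\frac{\partial^n}{\partial x^n} \delta_t, \psi\Big) = (-1)^n \psi^{(n)}(t).$$

We conclude that

$$\frac{\partial^n}{\partial t^n} \delta_t = (-1)^n \frac{\partial^n}{\partial x^n} \delta_t. \tag{12.43}$$

◊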


Let us try to deduce the fundamental solution for the one-dimensional wave equation from d'Alembert's formula (4.8),

$$u(t, x) = \frac{1}{2}[g(x + t) + g(x - t)] + \frac{1}{2} \int_{x-t}^{x+t} h(\tau) \, d\tau. \tag{12.44}$$

The second term in (12.44) could be interpreted as a convolution

$$\int_{x-t}^{x+t} h(\tau) \, d\tau = \int_{-\infty}^{\infty} \chi_{[-t,t]}(x - \tau)\,h(\tau) \, d\tau = \chi_{[-t,t]} * h(x),$$

where $\chi_I$ denotes a characteristic function as in (7.5). Therefore it makes sense to define this component of the fundamental solution as

$$W_t := \frac{1}{2} \chi_{[-t,t]}. \tag{12.45}$$

The time derivatives of $W_t$ are computed from the pairing

$$(W_t, \psi) = \frac{1}{2} \int_{-t}^{t} \psi \, dx$$

for $\psi \in C^\infty_{\mathrm{cpt}}(\mathbb{R})$. By the fundamental theorem of calculus,

$$\frac{d}{dt}(W_t, \psi) = \frac{1}{2}[\psi(t) + \psi(-t)], \tag{12.46}$$

which shows that

$$\frac{\partial W_t}{\partial t} = \frac{1}{2}(\delta_t + \delta_{-t}). \tag{12.47}$$

Differentiating again using (12.43) gives

$$\frac{\partial^2 W_t}{\partial t^2} = \frac{1}{2}\big(-\delta_t' + \delta_{-t}'\big). \tag{12.48}$$


On the other hand, $x$-derivatives of $W_t$ are defined by (12.20). In particular,

$$\Big(\frac{\partial^2 W_t}{\partial x^2}, \psi\Big) := (W_t, \psi'').$$

This can be evaluated by direct integration,

$$(W_t, \psi'') = \frac{1}{2} \int_{-t}^{t} \psi''(x) \, dx = \frac{1}{2}[\psi'(t) - \psi'(-t)].$$

This shows that

$$\frac{\partial^2 W_t}{\partial x^2} = \frac{1}{2}\big(-\delta_t' + \delta_{-t}'\big). \tag{12.49}$$

By (12.48) and (12.49), $W_t$ is a distributional solution of the wave equation,

$$\Big(\frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2}\Big) W_t = 0.$$

In contrast to the definition (12.27) of a fundamental solution in the spatial case, $W_t$ satisfies a homogeneous equation. The delta function appears only in the initial conditions,

$$W_0 = 0, \qquad \frac{\partial W_t}{\partial t}\Big|_{t=0} = \delta.$$

The distribution $W_t$, which is analogous to a fundamental solution, is called the wave kernel. By (12.47), the $g$ component of the d'Alembert solution formula (12.44) could be written in terms of $W_t$ as

$$\frac{1}{2}[g(x + t) + g(x - t)] = \frac{\partial W_t}{\partial t} * g(x).$$

Thus the full convolution formula for the solution reads

$$u(t, \cdot) = \frac{\partial W_t}{\partial t} * g + W_t * h.$$
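The convolution formula can be made concrete by evaluating its two pieces directly: $\frac{\partial W_t}{\partial t} * g$ is the average of $g$ at $x \pm t$, and $W_t * h$ is half the integral of $h$ over $[x - t, x + t]$. The Python sketch below is an illustration only, with assumed data not taken from the text: $g(x) = \sin x$ and $h(x) = \cos x$, for which the solution is $u(t, x) = \sin(x + t)$. The function name `wave_solution` and the midpoint quadrature are hypothetical choices.

```python
import math

def wave_solution(g, h, t, x, n=2000):
    # (dW_t/dt) * g: average of the initial displacement at x + t and x - t.
    displacement_part = 0.5 * (g(x + t) + g(x - t))
    # W_t * h: half the integral of the initial velocity over [x - t, x + t] (midpoint rule).
    step = 2.0 * t / n
    integral = sum(h(x - t + (i + 0.5) * step) for i in range(n)) * step
    velocity_part = 0.5 * integral
    return displacement_part + velocity_part

t, x = 1.3, 0.4
u_numeric = wave_solution(math.sin, math.cos, t, x)
u_exact = math.sin(x + t)   # exact solution for the assumed data g = sin, h = cos
print(u_numeric, u_exact)
```

At $t = 0$ the integral term vanishes and the function returns $g(x)$, matching the initial conditions $W_0 = 0$ and $\partial W_t/\partial t|_{t=0} = \delta$.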


12.7  Exercises 


12.1 Define the distribution $u \in \mathcal{D}'(\mathbb{R})$ by

$$(u, \psi) := \int_{-1}^{1} \frac{\psi(x) - \psi(0)}{x} \, dx + \int_{|x| > 1} \frac{\psi(x)}{x} \, dx.$$

Show that $u = \mathrm{PV}[x^{-1}]$.

12.2 Let $f \in L^1_{\mathrm{loc}}(\mathbb{R})$ be the function

$$f(x) = \begin{cases} \log x, & x > 0, \\ -\log(-x), & x < 0. \end{cases}$$

For $x \ne 0$, $f'(x) = |x|^{-1}$, but this is not locally integrable. Show that the distributional derivative is

$$(f', \psi) = \int_{-1}^{1} \frac{\psi(x) - \psi(0)}{|x|} \, dx + \int_{|x| > 1} \frac{\psi(x)}{|x|} \, dx.$$

12.3 Let $\mathbb{H}$ denote the upper half-plane $\{x_2 > 0\} \subset \mathbb{R}^2$. The goal of this problem is to show that the Laplace equation on $\mathbb{H}$,

$$\Delta u = 0, \qquad u(\cdot, 0) = g,$$

has the solution

$$u(y) = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{y_2}{(x - y_1)^2 + y_2^2}\, g(x) \, dx$$

for $g \in C^\infty_{\mathrm{cpt}}(\mathbb{R})$.

(a) Derive this formula from Theorem 12.10 using the method of images as in Example 12.11. In this case the reflection of $y \in \mathbb{H}$ is given by $(\tilde{y}_1, \tilde{y}_2) = (y_1, -y_2)$ (the complex conjugate).

(b) Show that the fact that $u(\cdot, 0) = g$ could also be derived by using Lemma 12.1 to deduce that

$$\lim_{y \to 0^+} \frac{y}{\pi(x^2 + y^2)} = \delta.$$

12.4 In $\mathbb{R}^3$ show that

$$(-\Delta - k^2) \frac{e^{ikr}}{4\pi r} = \delta$$

for all $k \in \mathbb{R}$.

12.5 For $n \ge 3$ let $\mathbb{B}^n$ denote the unit ball $\{r < 1\} \subset \mathbb{R}^n$.

(a) Apply the method of images as in Example 12.11 to derive the solution $H_y$ of (12.36) for $y \in \mathbb{B}^n$, and compute the Green's function. (Note that the formulas (12.39) and (12.40) remain valid in any dimension.)

(b) Show that the radial derivative of the Green's function satisfies

$$\frac{\partial G_y}{\partial r}(x) = -\frac{1 - |y|^2}{A_n |x - y|^n}$$

for $x \in \partial\mathbb{B}^n$.

(c) Find the resulting solution formula from Theorem 12.10, and show that this generalizes the mean value formula for harmonic functions obtained in Theorem 9.3.

Chapter  13 

The  Fourier  Transform 


For a bounded domain $\Omega \subset \mathbb{R}^n$, Theorem 11.7 shows that we can effectively "diagonalize" the Laplacian by choosing an orthonormal basis for $L^2(\Omega)$ consisting of eigenfunctions. Such a result is not possible on $\mathbb{R}^n$ itself; the Laplacian has no eigenfunctions in $L^2(\mathbb{R}^n)$.

The closest analog to eigenfunctions on $\mathbb{R}^n$ are the spatial components of the plane wave solutions introduced in Exercise 4.8,

$$\phi_\xi(x) := e^{i\xi \cdot x},$$

associated to a frequency vector $\xi \in \mathbb{R}^n$. These functions satisfy a convenient differentiation formula,

$$D^\alpha \phi_\xi = (i\xi)^\alpha \phi_\xi,$$

and in particular

$$-\Delta e^{i\xi \cdot x} = |\xi|^2 e^{i\xi \cdot x}.$$

The appropriate generalization of the Fourier series to $L^2(\mathbb{R}^n)$ is an integral transform based on these plane waves. Although the technical details are quite different from Fourier series, the transform serves a similar purpose in that it exchanges the roles of differentiation and multiplication.


13.1  Fourier  Transform 

The Fourier transform of a function $f \in L^1(\mathbb{R}^n)$ is a function of the frequency $\xi \in \mathbb{R}^n$ defined by

$$\hat{f}(\xi) := \int_{\mathbb{R}^n} e^{-i\xi \cdot x} f(x) \, d^nx. \tag{13.1}$$

Note that the integral is well defined by the integrability of $f$, and in fact

$$\big|\hat{f}(\xi)\big| \le \|f\|_{L^1} \tag{13.2}$$

for all $\xi \in \mathbb{R}^n$. As a map the transform is denoted by

$$\mathcal{F} : f \mapsto \hat{f}.$$

To develop the properties of the Fourier transform, it proves convenient to introduce a particular class of test functions, called Schwartz functions. The space $\mathcal{S}$ consists of smooth functions which, along with all derivatives, decay rapidly at infinity. The precise meaning of "rapid" is "faster than any power of $r$." An alternate form of this definition is

$$\mathcal{S}(\mathbb{R}^n) := \big\{ f \in C^\infty(\mathbb{R}^n);\ \|x^\alpha D^\beta f\|_\infty < \infty \text{ for all } \alpha, \beta \big\}, \tag{13.3}$$

where $\|\cdot\|_\infty$ is the sup norm introduced in Sect. 7.3.

A basic example of a Schwartz function is a Gaussian function of the form

$$f(x) = e^{-a|x|^2}$$

with $a > 0$. We also have

$$C^\infty_{\mathrm{cpt}}(\mathbb{R}^n) \subset \mathcal{S}(\mathbb{R}^n),$$

because compactly supported functions obviously satisfy the decay requirement.

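As in the discrete case, the Fourier transform interchanges the operations of differentiation and multiplication in a convenient way. For this statement, we let $D_x^\alpha$ and $D_\xi^\alpha$ denote partial derivatives with respect to $x$ or $\xi$, respectively.

Lemma 13.1 For $f \in \mathcal{S}(\mathbb{R}^n)$,

$$\mathcal{F}[D_x^\alpha f](\xi) = (i\xi)^\alpha \hat{f}(\xi), \tag{13.4}$$

and

$$\mathcal{F}[x^\alpha f](\xi) = (iD_\xi)^\alpha \hat{f}(\xi). \tag{13.5}$$

Proof The first identity follows from integration by parts,

$$\mathcal{F}[D_x^\alpha f](\xi) = \int_{\mathbb{R}^n} e^{-i\xi \cdot x} D_x^\alpha f(x) \, d^nx = \int_{\mathbb{R}^n} f(x)\,(-D_x)^\alpha\big(e^{-i\xi \cdot x}\big) \, d^nx = \int_{\mathbb{R}^n} f(x)\,(i\xi)^\alpha e^{-i\xi \cdot x} \, d^nx.$$

The second is also a direct computation,

$$\mathcal{F}[x^\alpha f](\xi) = \int_{\mathbb{R}^n} e^{-i\xi \cdot x} x^\alpha f(x) \, d^nx = \int_{\mathbb{R}^n} (iD_\xi)^\alpha\big(e^{-i\xi \cdot x}\big) f(x) \, d^nx = (iD_\xi)^\alpha \hat{f}(\xi).$$

Pulling the differentiation outside the integral in the final step is justified by the smoothness and decay assumptions on $f$. □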

In  Lemma  13.1  we  can  see  Schwartz’s  motivation  for  the  definition  of  S.  Under 
the  Fourier  transform,  smoothness  translates  to  rapid  decay,  and  vice  versa.  These 
properties  are  balanced  in  the  definition  of  S ,  which  leads  to  the  following  result. 

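Lemma 13.2 The Fourier transform $\mathcal{F}$ maps $\mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$.

Proof Suppose that $f \in \mathcal{S}$. In order to show that $\hat{f}$ is Schwartz, we need to produce a bound on the function $D_\xi^\alpha(\xi^\beta \hat{f})$ for each $\alpha, \beta$; by the product rule, such bounds for all $\alpha, \beta$ are equivalent to the bounds required in (13.3). By (13.4) and (13.5),

$$D_\xi^\alpha\big(\xi^\beta \hat{f}(\xi)\big) = (-i)^{|\alpha| + |\beta|} \int_{\mathbb{R}^n} e^{-i\xi \cdot x} x^\alpha D_x^\beta f(x) \, d^nx. \tag{13.6}$$

To estimate, we set

$$M_{N,\alpha,\beta} := \big\|(1 + |x|^2)^N x^\alpha D_x^\beta f\big\|_\infty,$$

which is finite by the definition (13.3). Because $(1 + |x|^2)^{-N}$ is integrable for $N$ sufficiently large, we can estimate (13.6) by

$$\Big|D_\xi^\alpha\big(\xi^\beta \hat{f}(\xi)\big)\Big| \le M_{N,\alpha,\beta} \int_{\mathbb{R}^n} \frac{1}{(1 + |x|^2)^N} \, d^nx.$$

The right-hand side is independent of $\xi$, so this yields the required estimate. □

Example 13.3 Consider the one-dimensional Gaussian function

$$\varphi(x) := e^{-ax^2}$$

for $a > 0$. Note that $\varphi$ satisfies the ODE

$$\frac{d\varphi}{dx} = -2ax\varphi.$$

Taking the Fourier transform of both sides and applying Lemma 13.1 gives

$$i\xi\hat{\varphi} = -2ai \frac{d\hat{\varphi}}{d\xi},$$

which reduces to

$$\frac{d\hat{\varphi}}{d\xi} = -\frac{\xi}{2a} \hat{\varphi}.$$

Separating variables and integrating yields the solution

$$\hat{\varphi}(\xi) = \hat{\varphi}(0)\, e^{-\xi^2/4a}.$$

To fix the constant, we can use (2.19) with $n = 1$ to compute

$$\hat{\varphi}(0) = \int_{-\infty}^{\infty} e^{-ax^2} \, dx = \sqrt{\frac{\pi}{a}}. \tag{13.7}$$

Thus,

$$\hat{\varphi}(\xi) = \sqrt{\frac{\pi}{a}}\, e^{-\xi^2/4a}.$$

◊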


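The computation from Example 13.3 can be generalized to $\mathbb{R}^n$ by factoring the integrals,

$$\begin{aligned}
\mathcal{F}\big[e^{-a|x|^2}\big](\xi) &= \int_{\mathbb{R}^n} e^{-i\xi\cdot x - a|x|^2}\, d^n x \\
&= \prod_{j=1}^n \left( \int_{-\infty}^\infty e^{-i\xi_j x_j - a x_j^2}\, dx_j \right) \\
&= \prod_{j=1}^n \sqrt{\frac{\pi}{a}}\, e^{-\xi_j^2/4a}.
\end{aligned}$$

Thus

$$\mathcal{F}\big[e^{-a|x|^2}\big](\xi) = \left(\frac{\pi}{a}\right)^{n/2} e^{-|\xi|^2/4a} \tag{13.8}$$

for $a > 0$.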
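As a sanity check on the one-dimensional case of (13.8), the transform of a Gaussian can be approximated by quadrature and compared with the closed form. This is a minimal sketch: the value $a = 0.7$, the truncation interval, and the function names are assumptions made for illustration.

```python
import cmath
import math

def fourier_transform(f, xi, half_width=12.0, steps=4000):
    """Trapezoid-rule approximation of the integral of exp(-i*x*xi) * f(x) dx
    over [-half_width, half_width]; adequate only for rapidly decaying f."""
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * cmath.exp(-1j * x * xi) * f(x)
    return total * h

def gaussian_transform_exact(a, xi):
    """Closed form (13.8) with n = 1: sqrt(pi/a) * exp(-xi^2 / (4a))."""
    return math.sqrt(math.pi / a) * math.exp(-xi * xi / (4.0 * a))

a = 0.7  # assumed sample value of the parameter

def gaussian(x):
    return math.exp(-a * x * x)

for xi in (0.0, 0.5, 2.0):
    numeric = fourier_transform(gaussian, xi)
    print(xi, numeric.real, gaussian_transform_exact(a, xi))
```

The two printed columns should agree closely at each sample frequency, and the imaginary part of the numerical transform should be negligible because the Gaussian is even.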

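For $f, g \in \mathcal{S}(\mathbb{R}^n)$, consider the integral

$$\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} f(x)\, e^{-ix\cdot y} g(y)\, d^n x\, d^n y. \tag{13.9}$$

The integrals over $x$ and $y$ can be taken in either order, yielding the useful identity:

$$\int_{\mathbb{R}^n} f\hat{g}\, d^n x = \int_{\mathbb{R}^n} \hat{f} g\, d^n y \tag{13.10}$$

for $f, g \in \mathcal{S}(\mathbb{R}^n)$.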

Theorem 13.4 The Fourier transform on $\mathcal{S}(\mathbb{R}^n)$ has an inverse $\mathcal{F}^{-1}$ given by

$$f(x) = (2\pi)^{-n} \int_{\mathbb{R}^n} e^{i\xi\cdot x} \hat{f}(\xi)\, d^n\xi. \tag{13.11}$$


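Proof In (13.10) let us set $g(y) = e^{-a|y|^2}$ for $a > 0$. By (13.8), this implies

$$\left(\frac{\pi}{a}\right)^{n/2} \int_{\mathbb{R}^n} f(x)\, e^{-|x|^2/4a}\, d^n x = \int_{\mathbb{R}^n} \hat{f}(y)\, e^{-a|y|^2}\, d^n y. \tag{13.12}$$

On the left-hand side we can use the same argument as in the proof of Lemma 12.1 to show that

$$\begin{aligned}
\lim_{a\to 0} \left(\frac{\pi}{a}\right)^{n/2} \int_{\mathbb{R}^n} f(x)\, e^{-|x|^2/4a}\, d^n x &= \pi^{n/2} \lim_{a\to 0} \int_{\mathbb{R}^n} f(\sqrt{a}\,x)\, e^{-|x|^2/4}\, d^n x \\
&= (2\pi)^n f(0).
\end{aligned}$$

We claim that the corresponding limit on the right-hand side of (13.12) is

$$\lim_{a\to 0} \int_{\mathbb{R}^n} \hat{f}(y)\, e^{-a|y|^2}\, d^n y = \int_{\mathbb{R}^n} \hat{f}(y)\, d^n y. \tag{13.13}$$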


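Because the convergence is not uniform, we will check this carefully. The difference of the two sides can be estimated by

$$\left| \int_{\mathbb{R}^n} \hat{f}(y)\, e^{-a|y|^2}\, d^n y - \int_{\mathbb{R}^n} \hat{f}(y)\, d^n y \right| \le \int_{\mathbb{R}^n} |\hat{f}(y)| \big(1 - e^{-a|y|^2}\big)\, d^n y.$$

Given $\varepsilon > 0$ we can choose $R$ large enough that

$$\int_{|y| \ge R} |\hat{f}(y)|\, d^n y < \varepsilon,$$

since $\hat{f}$ is integrable. Splitting the integral at $|y| = R$ gives the estimate

$$\int_{\mathbb{R}^n} |\hat{f}(y)| \big(1 - e^{-a|y|^2}\big)\, d^n y \le \varepsilon + \int_{B(0;R)} |\hat{f}(y)| \big(1 - e^{-a|y|^2}\big)\, d^n y.$$

The second term approaches zero as $a \to 0$, so that

$$\int_{\mathbb{R}^n} |\hat{f}(y)| \big(1 - e^{-a|y|^2}\big)\, d^n y < 2\varepsilon,$$

for $a$ sufficiently small. This establishes (13.13).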

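By these calculations, the limit of (13.12) as $a \to 0$ yields

$$(2\pi)^n f(0) = \int_{\mathbb{R}^n} \hat{f}(x)\, d^n x. \tag{13.14}$$

This is a special case of the desired formula.

The general inverse formula can be deduced from (13.14) by a simple translation argument. For $w \in \mathbb{R}^n$, define the translation operator $T_w$ on $\mathcal{S}(\mathbb{R}^n)$ by

$$T_w f(y) := f(y + w).$$

A change of variables shows that

$$\begin{aligned}
\widehat{T_w f}(x) &= \int_{\mathbb{R}^n} e^{-ix\cdot y} f(y + w)\, d^n y \\
&= \int_{\mathbb{R}^n} e^{-ix\cdot (y - w)} f(y)\, d^n y \\
&= e^{ix\cdot w} \hat{f}(x).
\end{aligned}$$

Since $T_w f(0) = f(w)$, plugging $T_w f$ into (13.14) gives

$$(2\pi)^n f(w) = \int_{\mathbb{R}^n} e^{ix\cdot w} \hat{f}(x)\, d^n x.$$

□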
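The inversion formula (13.11) can be illustrated numerically for $n = 1$ by transforming a Schwartz function and then applying the inverse integral. The sketch below assumes the shifted Gaussian $f(x) = e^{-(x-1)^2}$ and a simple trapezoid rule on a truncated interval; the helper names are hypothetical.

```python
import cmath
import math

def trapezoid(g, half_width=12.0, steps=600):
    """Composite trapezoid rule for a complex-valued g on [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        s = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * g(s)
    return total * h

def f(x):
    return math.exp(-(x - 1.0) ** 2)   # assumed test function

def f_hat(xi):
    """Forward transform: integral of exp(-i*x*xi) * f(x) dx."""
    return trapezoid(lambda x: cmath.exp(-1j * x * xi) * f(x))

def inverse_at(x):
    """Inverse formula (13.11) with n = 1, applied to the numerical f_hat."""
    return trapezoid(lambda xi: cmath.exp(1j * xi * x) * f_hat(xi)) / (2.0 * math.pi)

x0 = 0.4
print(inverse_at(x0), f(x0))
```

The nested quadrature is slow but keeps the example self-contained; the recovered value should match $f(x_0)$ up to quadrature error.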

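The pairing formula (13.10) suggests that the $L^2$ inner product will behave naturally under the Fourier transform. Indeed, by Theorem 13.4 we can compute

$$\begin{aligned}
\int_{\mathbb{R}^n} \hat{f}(\xi)\, \overline{\hat{g}(\xi)}\, d^n\xi &= \int_{\mathbb{R}^n} \left( \int_{\mathbb{R}^n} f(x)\, e^{-ix\cdot\xi}\, d^n x \right) \overline{\hat{g}(\xi)}\, d^n\xi \\
&= \int_{\mathbb{R}^n} f(x)\, \overline{\left( \int_{\mathbb{R}^n} e^{ix\cdot\xi}\, \hat{g}(\xi)\, d^n\xi \right)}\, d^n x \\
&= (2\pi)^n \int_{\mathbb{R}^n} f(x)\, \overline{g(x)}\, d^n x,
\end{aligned}$$

for $f, g \in \mathcal{S}(\mathbb{R}^n)$. In other words,

$$\langle \hat{f}, \hat{g} \rangle = (2\pi)^n \langle f, g \rangle. \tag{13.15}$$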
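The identity (13.15) lends itself to a quick numerical check for $n = 1$. In this sketch the two Schwartz functions, the truncation interval, and the helper names `transform` and `inner` are all assumptions chosen for illustration.

```python
import cmath
import math

def trapezoid(g, half_width=12.0, steps=600):
    """Composite trapezoid rule for a complex-valued g on [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        s = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * g(s)
    return total * h

def transform(f):
    """Return the map xi -> approximate integral of exp(-i*x*xi) * f(x) dx."""
    return lambda xi: trapezoid(lambda x: cmath.exp(-1j * x * xi) * f(x))

def inner(u, v):
    """L^2 inner product <u, v> = integral of u times the conjugate of v."""
    return trapezoid(lambda s: u(s) * complex(v(s)).conjugate())

def f(x):
    return math.exp(-(x - 1.0) ** 2)            # assumed test function

def g(x):
    return (1.0 + x) * math.exp(-0.5 * x * x)   # assumed test function

lhs = inner(transform(f), transform(g))   # <f_hat, g_hat>
rhs = 2.0 * math.pi * inner(f, g)         # (2 pi)^n <f, g> with n = 1
print(lhs, rhs)
```

The factor $(2\pi)^n$ reflects the convention used here, in which the forward transform carries no normalizing constant.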
The integral (13.1) defining the Fourier transform does not necessarily converge for $f \in L^2(\mathbb{R}^n)$, but the identity (13.15) makes it possible to define transforms on $L^2$ by taking limits.

Theorem 13.5 (Plancherel’s theorem) The Fourier transform extends from $\mathcal{S}(\mathbb{R}^n)$ to an invertible map on $L^2(\mathbb{R}^n)$, such that (13.15) holds for all $f, g \in L^2(\mathbb{R}^n)$.

Proof First note that Theorem 7.5 implies that $\mathcal{S}(\mathbb{R}^n)$ is dense in $L^2(\mathbb{R}^n)$ because it includes the compactly supported smooth functions. Hence for $f \in L^2(\mathbb{R}^n)$ there exists a sequence of Schwartz functions $\phi_k \to f$ in $L^2$. As a convergent sequence, $\{\phi_k\}$ is automatically Cauchy, i.e.,

$$\lim_{k,m\to\infty} \|\phi_k - \phi_m\|_2 = 0.$$

By (13.15),

$$\|\hat{\phi}_k - \hat{\phi}_m\|_2 = (2\pi)^{n/2}\, \|\phi_k - \phi_m\|_2,$$

implying that $\{\hat{\phi}_k\}$ is also Cauchy in $L^2(\mathbb{R}^n)$. Since $L^2(\mathbb{R}^n)$ is complete by Theorem 7.7, this implies convergence, and we can then define

$$\hat{f} := \lim_{k\to\infty} \hat{\phi}_k,$$

with the limit taken in the $L^2$ sense.

To show that (13.15) extends to $L^2$, suppose for $f, g \in L^2$ that $\phi_k \to f$ and $\psi_m \to g$ are approximating sequences of Schwartz functions. By the property (13.15),

$$\langle \hat{\phi}_k, \hat{\psi}_m \rangle = (2\pi)^n \langle \phi_k, \psi_m \rangle.$$

Taking the limit $k, m \to \infty$ then gives

$$\langle \hat{f}, \hat{g} \rangle = (2\pi)^n \langle f, g \rangle.$$

The same argument can be used to show that $\hat{f}$ is independent of the choice of approximating sequence. □


13.2  Tempered  Distributions 

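Since $\mathcal{F}$ maps $\mathcal{S}(\mathbb{R}^n)$ to itself, to extend the Fourier transform to distributions it is natural to replace $C^\infty_{\mathrm{cpt}}(\mathbb{R}^n)$ by $\mathcal{S}(\mathbb{R}^n)$ as the space of test functions. The result is the space of tempered distributions

$$\mathcal{S}'(\mathbb{R}^n) := \{\text{continuous linear functionals } \mathcal{S}(\mathbb{R}^n) \to \mathbb{C}\}.$$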
Here the word “tempered” refers to a restriction on the growth at infinity. Because the Schwartz functions decay rapidly, a locally integrable function is essentially required to have at most polynomial growth at infinity in order to define an element of $\mathcal{S}'(\mathbb{R}^n)$.

The definition of continuity of a functional on $\mathcal{S}(\mathbb{R}^n)$ depends on a notion of convergence for Schwartz functions. A sequence $\{\psi_k\} \subset \mathcal{S}(\mathbb{R}^n)$ converges if the sequences $\{x^\alpha D^\beta \psi_k\}$ converge uniformly for each $\alpha, \beta$. To say that $u \in \mathcal{S}'(\mathbb{R}^n)$ is continuous means that $(u, \psi_k) \to (u, \psi)$ whenever $\psi_k \to \psi$ in $\mathcal{S}(\mathbb{R}^n)$.

The delta function $\delta_x$ and its derivatives are clearly tempered distributions. We claim also that

$$L^p(\mathbb{R}^n) \subset \mathcal{S}'(\mathbb{R}^n),$$

for $p \in [1, \infty]$. This follows fairly directly from the fact that $\mathcal{S}(\mathbb{R}^n) \subset L^p(\mathbb{R}^n)$ for $p \in [1, \infty]$.

The pairing formula (13.10) gives the prescription for extending $\mathcal{F}$ to the tempered distributions. For $u \in \mathcal{S}'(\mathbb{R}^n)$, we define $\hat{u}$ by

$$(\hat{u}, \phi) := (u, \hat{\phi}) \tag{13.16}$$

for $\phi \in \mathcal{S}(\mathbb{R}^n)$. To justify this definition one needs to check that the Fourier transform is continuous as a map $\mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$. This essentially follows from the calculations in the proof of Lemma 13.2.

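As an example, consider the function $u = 1$ as an element of $\mathcal{S}'(\mathbb{R}^n)$. For $\psi \in \mathcal{S}(\mathbb{R}^n)$,

$$(\hat{1}, \psi) := (1, \hat{\psi}) = \int_{\mathbb{R}^n} \hat{\psi}(x)\, d^n x.$$

According to the inverse Fourier transform formula (13.11),

$$\int_{\mathbb{R}^n} \hat{\psi}(x)\, d^n x = (2\pi)^n \psi(0).$$

Therefore

$$\hat{1} = (2\pi)^n \delta. \tag{13.17}$$

Physicists often express this fact by writing

$$\delta(x) = (2\pi)^{-n} \int_{\mathbb{R}^n} e^{-ix\cdot\xi}\, d^n\xi,$$

with the understanding that the integral on the right is not to be taken literally.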

The Fourier transform of $\delta$ is a similar calculation. For $\psi \in \mathcal{S}$,

$$(\hat{\delta}, \psi) := (\delta, \hat{\psi}) = \hat{\psi}(0) = \int_{\mathbb{R}^n} \psi(x)\, d^n x. \tag{13.18}$$

Therefore

$$\hat{\delta} = 1. \tag{13.19}$$

Because differentiation and multiplication by polynomials are continuous operations on $\mathcal{S}(\mathbb{R}^n)$, they extend to tempered distributions. From Lemma 13.1 we immediately derive the following:

Lemma 13.6 For $u \in \mathcal{S}'(\mathbb{R}^n)$,

$$\mathcal{F}[D_x^\alpha u] = (i\xi)^\alpha \hat{u},$$

and

$$\mathcal{F}[x^\alpha u] = (iD_\xi)^\alpha \hat{u}.$$

The Fourier transform on $\mathcal{S}'(\mathbb{R}^n)$ is particularly useful in the construction of fundamental solutions. Consider the constant coefficient operator

$$L = \sum_{|\alpha| \le m} a_\alpha D_x^\alpha$$

with $a_\alpha \in \mathbb{C}$. According to Lemma 13.6 and (13.19), the Fourier transform of the equation

$$L\Phi = \delta$$

is

$$P(\xi)\hat{\Phi} = 1,$$

where

$$P(\xi) := \sum_{|\alpha| \le m} a_\alpha (i\xi)^\alpha.$$

If the reciprocal of $P(\xi)$ makes sense as a tempered distribution then we can set $\hat{\Phi}(\xi) = 1/P(\xi)$ and take the inverse Fourier transform to construct a fundamental solution $\Phi$ as an element of $\mathcal{S}'(\mathbb{R}^n)$.

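Example 13.7 The polynomial corresponding to $-\Delta$ on $\mathbb{R}^n$ is $P(\xi) = |\xi|^2$. For $n \ge 3$ the function $|\xi|^{-2}$ is locally integrable and decays at infinity, so this defines a tempered distribution. Hence we should be able to compute $\Phi$ as the inverse Fourier transform of $|\xi|^{-2}$.

Because $|\xi|^{-2}$ is not globally integrable, we cannot apply the formula (13.11) directly. A trick to get around this is based on the fact that

$$\int_0^\infty e^{-at}\, dt = \frac{1}{a}$$

for $a > 0$. Setting $a = |\xi|^2$ gives

$$|\xi|^{-2} = \int_0^\infty e^{-t|\xi|^2}\, dt$$

for $\xi \ne 0$. We can pair both sides with a Schwartz function $\psi(\xi)$ and integrate to show that

$$\mathcal{F}^{-1}\big[|\xi|^{-2}\big] = \int_0^\infty \mathcal{F}^{-1}\big[e^{-t|\xi|^2}\big]\, dt. \tag{13.20}$$

Setting $a = 1/(4t)$ in (13.8) gives

$$\mathcal{F}^{-1}\big[e^{-t|\xi|^2}\big](x) = (4\pi t)^{-n/2}\, e^{-|x|^2/4t},$$

so that (13.20) reduces to

$$\mathcal{F}^{-1}\big[|\xi|^{-2}\big](x) = \int_0^\infty (4\pi t)^{-n/2}\, e^{-|x|^2/4t}\, dt.$$

To evaluate the integral we substitute $s = |x|^2/4t$ to obtain

$$\begin{aligned}
\mathcal{F}^{-1}\big[|\xi|^{-2}\big](x) &= \int_0^\infty \left( \frac{\pi |x|^2}{s} \right)^{-n/2} e^{-s}\, \frac{|x|^2}{4s^2}\, ds \\
&= \frac{1}{4}\, \pi^{-n/2}\, |x|^{2-n} \int_0^\infty s^{\frac{n}{2}-2} e^{-s}\, ds.
\end{aligned}$$

In terms of the gamma function (2.17) this calculation gives the fundamental solution

$$\Phi(x) = \frac{1}{4}\, \pi^{-n/2}\, \Gamma\!\left(\tfrac{n}{2} - 1\right) |x|^{2-n}. \tag{13.21}$$

This agrees with the formula for $\Phi$ from Theorem 12.8, because the surface area of the unit sphere in $\mathbb{R}^n$ is

$$\frac{2\pi^{n/2}}{\Gamma\!\left(\tfrac{n}{2}\right)}$$

and

$$\Gamma\!\left(\tfrac{n}{2}\right) = \left(\tfrac{n}{2} - 1\right) \Gamma\!\left(\tfrac{n}{2} - 1\right).$$

◊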
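The agreement of constants in Example 13.7 can be confirmed with a few lines of arithmetic. The sketch assumes that the constant from Theorem 12.8 has the standard form $1/((n-2)A_n)$, where $A_n = 2\pi^{n/2}/\Gamma(n/2)$ is the area of the unit sphere; that form and the function names are assumptions, since the theorem itself is not reproduced here.

```python
import math

def phi_constant_fourier(n):
    """Constant in (13.21): (1/4) * pi^(-n/2) * Gamma(n/2 - 1)."""
    return 0.25 * math.pi ** (-n / 2.0) * math.gamma(n / 2.0 - 1.0)

def phi_constant_sphere(n):
    """Assumed classical form 1 / ((n - 2) * A_n), with
    A_n = 2 * pi^(n/2) / Gamma(n/2) the area of the unit sphere."""
    area = 2.0 * math.pi ** (n / 2.0) / math.gamma(n / 2.0)
    return 1.0 / ((n - 2) * area)

for n in range(3, 9):
    print(n, phi_constant_fourier(n), phi_constant_sphere(n))
```

For $n = 3$ both expressions reduce to $1/(4\pi)$, the familiar Newtonian potential constant.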
13.3 The Wave Kernel

Following the discussion in Sect. 12.6, we can try to define the wave kernel $W_t$ on $\mathbb{R}^n$ by solving the distributional equations

$$\left( \frac{\partial^2}{\partial t^2} - \Delta \right) W_t = 0, \qquad W_0 = 0, \qquad \frac{\partial W_t}{\partial t}\bigg|_{t=0} = \delta. \tag{13.22}$$


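If we assume that $W_t \in \mathcal{S}'(\mathbb{R}^n)$, then the spatial Fourier transform allows us to analyze this equation by turning it into a simple ODE.

For each $t$ define $\hat{W}_t \in \mathcal{S}'(\mathbb{R}^n)$ by the (spatial) distributional transform (13.16). By Lemma 13.6 and (13.19), (13.22) transforms to

$$\left( \frac{\partial^2}{\partial t^2} + |\xi|^2 \right) \hat{W}_t = 0, \qquad \hat{W}_0 = 0, \qquad \frac{\partial \hat{W}_t}{\partial t}\bigg|_{t=0} = 1.$$

The unique solution to this ODE is

$$\hat{W}_t(\xi) = \begin{cases} \dfrac{\sin(t|\xi|)}{|\xi|}, & \xi \ne 0, \\[1ex] t, & \xi = 0. \end{cases} \tag{13.23}$$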


The function $\hat{W}_t$ is smooth and bounded, and therefore defines a tempered distribution on $\mathbb{R}^n$. The inverse Fourier transform $W_t \in \mathcal{S}'(\mathbb{R}^n)$ thus yields a general solution formula for the wave equation on $\mathbb{R}^n$. For initial conditions $g, h \in \mathcal{S}(\mathbb{R}^n)$,

$$u(t, \cdot) = \frac{\partial W_t}{\partial t} * g + W_t * h. \tag{13.24}$$

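The direct computation of the inverse Fourier transform of (13.23) is rather tricky, but we can check this formula against the results we already know. For $n = 1$ we have $W_t = \tfrac{1}{2}\chi_{[-t,t]}$ from the d’Alembert formula. Since this is integrable the Fourier transform can be computed directly:

$$\begin{aligned}
\hat{W}_t(\xi) &= \frac{1}{2} \int_{-\infty}^\infty \chi_{[-t,t]}(x)\, e^{-ix\xi}\, dx \\
&= \frac{1}{2} \int_{-t}^{t} e^{-ix\xi}\, dx \\
&= \begin{cases} \dfrac{\sin(t\xi)}{\xi}, & \xi \ne 0, \\[1ex] t, & \xi = 0. \end{cases}
\end{aligned}$$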


For $n = 3$, the Kirchhoff formula from Theorem 4.10 shows that the wave kernel is the distribution defined by

$$(W_t, \psi) := \frac{1}{4\pi t} \int_{\partial B(0;t)} \psi\, dS,$$

for $\psi \in \mathcal{S}(\mathbb{R}^3)$. By definition, the Fourier transform is given by

$$\begin{aligned}
(\hat{W}_t, \psi) &:= \frac{1}{4\pi t} \int_{\partial B(0;t)} \hat{\psi}(x)\, dS(x) \\
&= \frac{1}{4\pi t} \int_{\partial B(0;t)} \left( \int_{\mathbb{R}^3} e^{-ix\cdot\xi}\, \psi(\xi)\, d^3\xi \right) dS(x).
\end{aligned}$$

Since $\psi(\xi)$ has rapid decay as $|\xi| \to \infty$ and the $x$ integral is restricted to a sphere, we can switch the order of integration and conclude that

$$\hat{W}_t(\xi) = \frac{1}{4\pi t} \int_{\partial B(0;t)} e^{-ix\cdot\xi}\, dS(x).$$


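To compute this surface integral, note that we could rotate the $x$ coordinate without changing the result of the integration. It therefore suffices to consider the case where $\xi$ is parallel to the $x_3$ axis. If we then use the spherical coordinates $(r, \theta, \varphi)$ for the $x$ variables, this gives

$$x \cdot \xi = |\xi|\, r \cos\varphi.$$

For the surface integral at radius $r = t$,

$$dS(x) = t^2 \sin\varphi\, d\varphi\, d\theta.$$

The Fourier transform is thus

$$\begin{aligned}
\hat{W}_t(\xi) &= \frac{1}{4\pi t} \int_0^{2\pi} \int_0^{\pi} e^{-it|\xi|\cos\varphi}\, t^2 \sin\varphi\, d\varphi\, d\theta \\
&= \frac{t}{2} \int_0^{\pi} e^{-it|\xi|\cos\varphi} \sin\varphi\, d\varphi.
\end{aligned}$$

With the substitution $u = \cos\varphi$ this becomes

$$\begin{aligned}
\hat{W}_t(\xi) &= \frac{t}{2} \int_{-1}^{1} e^{-it|\xi|u}\, du \\
&= \begin{cases} \dfrac{\sin(t|\xi|)}{|\xi|}, & \xi \ne 0, \\[1ex] t, & \xi = 0. \end{cases}
\end{aligned}$$

Hence the Kirchhoff formula agrees with the transform solution (13.23).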
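Both consistency checks for (13.23) reduce to elementary one-variable integrals, which can be evaluated numerically. The sketch below uses a composite Simpson rule; the sample values of $t$ and $|\xi|$ and all function names are assumptions for illustration.

```python
import cmath
import math

def simpson(g, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = g(a) + g(b)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * g(a + k * h)
    return total * h / 3.0

def wave_symbol(t, r):
    """Right-hand side of (13.23) as a function of r = |xi|."""
    return t if r == 0.0 else math.sin(t * r) / r

def kirchhoff_transform(t, r):
    """n = 3: (t/2) times the integral over u in [-1, 1] of exp(-i*t*r*u)."""
    return 0.5 * t * simpson(lambda u: cmath.exp(-1j * t * r * u), -1.0, 1.0)

def dalembert_transform(t, xi):
    """n = 1: (1/2) times the integral over x in [-t, t] of exp(-i*x*xi)."""
    return 0.5 * simpson(lambda x: cmath.exp(-1j * x * xi), -t, t)

t = 1.5
for r in (0.0, 0.7, 4.0):
    print(r, kirchhoff_transform(t, r), dalembert_transform(t, r), wave_symbol(t, r))
```

In each row the two numerical transforms should agree with the closed form, with negligible imaginary parts.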
13.4 The Heat Kernel


By analogy with (13.22), the heat kernel $H_t$ is defined as the solution of the distributional equation

$$\left( \frac{\partial}{\partial t} - \Delta \right) H_t = 0, \qquad H_0 = \delta. \tag{13.25}$$

Assuming $H_t \in \mathcal{S}'(\mathbb{R}^n)$, let $\hat{H}_t$ denote the spatial Fourier transform of $H_t$. By Lemma 13.6 and (13.19), (13.25) transforms to

$$\left( \frac{\partial}{\partial t} + |\xi|^2 \right) \hat{H}_t = 0, \qquad \hat{H}_0 = 1.$$

This simple ODE has the unique solution

$$\hat{H}_t(\xi) = e^{-t|\xi|^2}. \tag{13.26}$$

Because $\hat{H}_t$ is a Schwartz function for $t > 0$, we can compute the inverse Fourier transform by the direct integral formula (13.11), which gives

$$H_t(x) = (2\pi)^{-n} \int_{\mathbb{R}^n} e^{i\xi\cdot x}\, e^{-t|\xi|^2}\, d^n\xi.$$

According to (13.8), this inverse transform is

$$H_t(x) = (4\pi t)^{-n/2}\, e^{-|x|^2/4t}. \tag{13.27}$$

In Sect. 6.3 we guessed this formula from a calculation in the one-dimensional case. The Fourier transform allows for a systematic derivation.
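The step from (13.26) to (13.27) can be illustrated numerically for $n = 1$ by evaluating the inverse integral with a quadrature rule and comparing with the closed form. The truncation interval, sample points, and function names below are assumptions for illustration.

```python
import cmath
import math

def heat_kernel_by_inversion(t, x, half_width=40.0, steps=8000):
    """(2 pi)^(-1) times the integral of exp(i*xi*x) * exp(-t*xi^2) d xi,
    approximated by the trapezoid rule on [-half_width, half_width] (n = 1)."""
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        xi = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-t * xi * xi) * cmath.exp(1j * xi * x)
    return total * h / (2.0 * math.pi)

def heat_kernel_formula(t, x):
    """Closed form (13.27) with n = 1."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

for t in (0.05, 0.3, 2.0):
    for x in (0.0, 0.5, 3.0):
        print(t, x, heat_kernel_by_inversion(t, x).real, heat_kernel_formula(t, x))
```

The quadrature is only trustworthy when $e^{-t\xi^2}$ is negligible at the ends of the truncation interval, which is why very small $t$ would require a wider range.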


13.5  Exercises 

13.1 Let $\mathbb{H} \subset \mathbb{R}^2$ denote the upper half space $\{x_2 > 0\}$. The Poisson kernel on $\mathbb{H}$ is the distributional solution of the equation

$$\Delta P = 0, \qquad P|_{x_2 = 0} = \delta.$$

(a) Let $\hat{P}(\xi, x_2)$ denote the distributional Fourier transform of $P$ with respect to the $x_1$ variable. Find the corresponding equation for $\hat{P}$.

(b) Show that the unique solution of the ODE with $\hat{P}(\cdot, x_2) \in \mathcal{S}'(\mathbb{R})$ is the function

$$\hat{P}(\xi, x_2) = e^{-x_2|\xi|}.$$

(c) Compute the inverse transform to show that

$$P(x) = \frac{x_2}{\pi (x_1^2 + x_2^2)}.$$

(d) For $f \in \mathcal{S}(\mathbb{R})$, use $P$ to write an integral formula for the solution of the Laplace problem on $\mathbb{H}$:

$$\Delta u = 0, \qquad u|_{x_2 = 0} = f.$$

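13.2 For $\psi \in \mathcal{S}(\mathbb{R})$, the Poisson summation formula says that

$$\frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \hat{\psi}(k) = \sum_{m=-\infty}^{\infty} \psi(2\pi m).$$

Derive this formula using the steps below.

(a) Define a periodic function $f \in C^\infty(\mathbb{T})$ (where $\mathbb{T} := \mathbb{R}/2\pi\mathbb{Z}$ as in Sect. 8.2) by averaging $\psi$,

$$f(x) := \sum_{m=-\infty}^{\infty} \psi(x + 2\pi m).$$

Show that

$$c_k[f] = \frac{1}{2\pi}\, \hat{\psi}(k).$$

(b) Obtain the summation formula by comparing $f$ to its Fourier series expansion at $x = 0$.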


13.3 Recall that the heat equation on $\mathbb{T}$ was solved by Fourier series in Theorem 8.13.

(a) Use the solution formula (8.44) to show that the heat kernel on $\mathbb{T}$ is given by the series

$$h_t(x) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} e^{-k^2 t + ikx}.$$

(b) Use the Poisson summation formula from Exercise 13.2 to show that the periodic heat kernel $h_t$ and the heat kernel $H_t$ on $\mathbb{R}$ are related by averaging

$$h_t(x) = \sum_{m=-\infty}^{\infty} H_t(x + 2\pi m)$$

for $t > 0$. (Note that this shows $h_t(x) > 0$ for all $x \in \mathbb{T}$, $t > 0$, which is not clear in the formula from (a).)




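13.4 The Schrödinger equation on $\mathbb{R}^n$,

$$-i\frac{\partial u}{\partial t} - \Delta u = 0, \qquad u|_{t=0} = g,$$

was introduced in Exercise 4.7.

(a) Assuming that $g \in \mathcal{S}(\mathbb{R}^n)$, find a formula for the spatial Fourier transform $\hat{u}(t, \xi)$.

(b) Show that the result from Exercise 4.7,

$$\int_{\mathbb{R}^n} |u(t, x)|^2\, d^n x = \int_{\mathbb{R}^n} |g(x)|^2\, d^n x$$

for all $t \ge 0$, follows from the Plancherel theorem (Theorem 13.5).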
Appendix A

Analysis Foundations


In this section we will develop some implications of the completeness axiom for $\mathbb{R}$ which are referenced in the text.

The fundamental result from which the others follow is the equivalence of compactness and sequential compactness for subsets of $\mathbb{R}^n$. Recall from Sect. 11.6 that a set $A$ is sequentially compact if every sequence within $A$ contains a subsequence converging to a limit in $A$. The equivalence was first proven by Bernard Bolzano in the early 19th century, and later rediscovered by Karl Weierstrass.


Theorem A.1 (Bolzano-Weierstrass) In $\mathbb{R}^n$ a subset is sequentially compact if and only if it is closed and bounded.


Proof If $A \subset \mathbb{R}^n$ is unbounded, then there exists a sequence of points $\{x_j\} \subset A$ with $|x_j| \to \infty$. Any subsequence has the same property, so $\{x_j\}$ has no convergent subsequence. If $A$ is not closed, then there is some $w \notin A$ which is a boundary point of $A$. Every neighborhood of $w$ thus includes points of $A$, so there exists a sequence $\{x_j\} \subset A$ converging to $w$. All subsequences of $\{x_j\}$ also converge to $w$, and therefore no subsequence converges in $A$. We conclude that a sequentially compact subset of $\mathbb{R}^n$ is closed and bounded.

For the converse argument, let us first consider the one-dimensional case. Let $\{x_j\}$ be a sequence in a bounded set $A \subset \mathbb{R}$. For each $n$ the real number

$$b_n := \sup\{x_k;\ k \ge n\} \tag{A.1}$$

exists by the completeness axiom. The sequence $\{b_n\}$ is decreasing, because the supremum is taken over successively smaller sets, and also bounded by the hypothesis on $A$. Therefore the number

$$\alpha := \inf_{n \in \mathbb{N}} b_n$$

is well-defined in $\mathbb{R}$. By the definition of the infimum, $b_n \ge \alpha$ for all $n$.

We claim that a subsequence of $\{x_k\}$ converges to $\alpha$. For this purpose it suffices to show that the interval $(\alpha - \varepsilon, \alpha + \varepsilon)$ contains infinitely many $x_k$ for each $\varepsilon > 0$. If this were not the case, then for some $n$ we would have $x_k \notin (\alpha - \varepsilon, \alpha + \varepsilon)$ for all $k \ge n$. This would imply, for each $m \ge n$, either $b_m \le \alpha - \varepsilon$ or $b_m \ge \alpha + \varepsilon$. The first is impossible because $b_m \ge \alpha$, and the second cannot hold for every $m \ge n$ because $\alpha$ is the infimum of the decreasing sequence $\{b_m\}$.

This proves the existence of a subsequence converging to $\alpha$. The fact that $A$ is closed implies $\alpha \in A$, so this completes the argument that a closed bounded subset of $\mathbb{R}$ is sequentially compact.

To extend this argument to higher dimensions, consider a sequence $\{x_k\}$ in a closed and bounded subset $A \subset \mathbb{R}^n$. The sequence of first coordinates of the $x_k$ is a bounded sequence in $\mathbb{R}$, so the above argument yields a subsequence such that the first coordinates converge. We can then restrict our attention to this subsequence and apply the same reasoning to the second coordinate, and so on. After $n$ steps this procedure produces a subsequence which converges in $\mathbb{R}^n$, and the limit is an element of $A$ because $A$ is closed. □
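The tail suprema $b_n$ from (A.1) can be illustrated on a concrete bounded sequence that does not converge. The sketch below assumes the sequence $x_k = (-1)^k (1 + 1/k)$ and works with a finite truncation, so the maximum over a finite tail is only a stand-in for the supremum; the names `tail_suprema`, `K`, and `eps` are hypothetical.

```python
def tail_suprema(xs):
    """Return b with b[n] = max(xs[n:]), computed from right to left.
    This is a finite stand-in for the supremum in (A.1)."""
    b = [0.0] * len(xs)
    running = float("-inf")
    for n in range(len(xs) - 1, -1, -1):
        running = max(running, xs[n])
        b[n] = running
    return b

# Assumed example: x_k = (-1)^k (1 + 1/k) for k = 1..K, bounded but not convergent.
K = 2000
xs = [(-1) ** k * (1.0 + 1.0 / k) for k in range(1, K + 1)]
b = tail_suprema(xs)

# Indices k whose terms lie within eps of 1, the limiting value of the b_n here.
eps = 0.01
near = [k for k, x in enumerate(xs, start=1) if abs(x - 1.0) < eps]

print(b[0], b[-1], len(near))
```

The list `b` is decreasing, and the many indices collected in `near` mirror the claim in the proof that every interval around $\alpha$ contains infinitely many terms of the sequence.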

Bolzano used sequential compactness to prove the following result, which serves as the foundation for applications of calculus to optimization problems.

Theorem A.2 (Extreme value theorem) For a compact set $K \subset \mathbb{R}^n$, a continuous function $K \to \mathbb{R}$ achieves a maximum and minimum value on $K$.


Proof Assume that $f : K \to \mathbb{R}$ is continuous. We will show first that $f$ is bounded. Suppose there is a sequence $x_j \in K$ such that $|f(x_j)| \to \infty$. By Theorem A.1, after restricting to a subsequence if necessary, we can assume that $x_j \to w \in K$. Continuity implies $f(x_j) \to f(w)$, but this is impossible if $|f(x_j)| \to \infty$. Therefore a continuous function on $K$ is bounded.

Since $f(K)$ is a bounded subset of $\mathbb{R}$, $b := \sup f(K)$ exists in $\mathbb{R}$ by the completeness axiom. To prove that $f$ achieves a maximum, we need to show $b \in f(K)$. If $b \notin f(K)$ then the function

$$h(x) := \frac{1}{b - f(x)}$$

is continuous on $K$, and therefore bounded by the above argument. However, $h(x) \le M$ for $x \in K$ would imply that $\sup f(K) \le b - 1/M$, contradicting the definition of $b$. Therefore $b \in f(K)$, so $f$ achieves a maximum. A similar argument applies to the minimum. □


The final result is the completeness of $\mathbb{R}^n$ as a normed vector space, as noted in Sect. 7.4.

Theorem A.3 In $\mathbb{R}^n$ a sequence converges if and only if it is Cauchy.

Proof We have already noted that a convergent sequence is Cauchy in a normed vector space. Suppose that $\{x_k\}$ is a Cauchy sequence in $\mathbb{R}^n$. This implies in particular that the sequence is bounded. Therefore, by Theorem A.1, there exists a subsequence converging to some $w \in \mathbb{R}^n$.
By the definition of Cauchy, for $\varepsilon > 0$ there exists $N$ sufficiently large such that

$$|x_j - x_k| < \varepsilon$$

for all $j, k \ge N$. We can also choose an element $x_l$ in the subsequence such that $l \ge N$ and

$$|x_l - w| < \varepsilon.$$

The triangle inequality then gives

$$|x_j - w| \le |x_j - x_l| + |x_l - w| < 2\varepsilon$$

for all $j \ge N$. Since the choice of $\varepsilon$ was arbitrary, this shows that the full sequence converges to $w$. □